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ABSTRACT 


Bauer,  Kenneth  W.,  Jr.  Ph.D.,  Purdue  University,  May  1987. 

Control  Variate  Selection  for  Multiresponse  Simulation. 

Major  Professor:  James  R.  Wilson. 

A  solution  is  offered  to  the  general  problem  of  optimal  selection  of 
control  variates.  Solutions  are  offered  for  two  different  cases  of  the  general 

problem:  (a^when  the  covariance  matrix  of  the  controls  is  unknown,  and  (b)  _ 

* 

when  the  covariance  matrix  of  the  controls  is  known  and  is  incorporated 
into  point  and  confidence  region  estimators.  For  the  second  case  a  new 
estimator  is  introduced.  Under  the  assumption  that  the  responses  and  the 
controls  are  jointly  normal,  the  unbiasness  of  this  new  estimator  is 
established  ,  and  its  dispersion  matrix  is  derived.  A  selection  algorithm  is 
implemented  which  locates  the  optimal  subset  of  controls.  The  algorithm  is 
based  on  criteria  derived  for  the  two  cases  listed  above.  A  promising  new 
class  of  controls  is  introduced  which  are  called  ^routing  variables”.  The 
asymptotic  distribution  of  these  controls  is  derived  as  well  as  their 
asymptotic  mean  and  variance.  Finally,  the  performance  of  the  selection 
algorithm  is  investigated  and  the  new  estimator  is  contrasted  with  the 


classical  estimator. 
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ABSTRACT 

Bauer,  Kenneth  W.,  Jr.  Ph.D.,  Purdue  University,  May  1987. 

Control  Variate  Selection  for  Multiresponse  Simulation. 

Major  Professor:  James  R.  Wilson. 

A  solution  is  offered  to  the  general  problem  of  optimal  selection  of 
control  variates.  Solutions  are  offered  for  two  different  cases  of  the  general 
problem:  (a)  when  the  covariance  matrix  of  the  controls  is  unknown,  and  (b) 
when  the  covariance  matrix  of  the  controls  is  known  and  is  incorporated 
into  point  and  confidence  region  estimators.  For  the  second  case  a  new 
estimator  is  introduced.  Under  the  assumption  that  the  responses  and  the 
controls  re  jointly  normal,  the  unbiasness  of  this  new  estimator  is 
established  ,  and  its  dispersion  matrix  is  derived.  A  selection  algorithm  is 
implemented  which  locates  the  optimal  subset  of  controls.  The  algorithm  is 
based  on  criteria  derived  for  the  two  cases  listed  above.  A  promising  new 
class  of  controls  is  introduced  which  are  called  “routing  variables”.  The 
asymptotic  distribution  of  these  controls  is  derived  as  well  as  their 
asymptotic  mean  and  variance.  Finally,  the  performance  of  the  selection 
algorithm  is  investigated  and  the  new  estimator  is  contrasted  with  the 


classical  estimator. 


CHAPTER  1 


INTRODUCTION 


The  method  of  control  variables  is  one  of  the  main  variance  reduction 
techniques  used  in  discrete  event  simulation.  This  method  attempts  to  exploit 
correlations  between  output  responses  and  associated  auxiliary  variables  with 
known  means  that  can  be  observed  during  the  course  of  a  simulation  run. 
Although  control  variables  can  be  external  (that  is,  similar  variables  in  a 
much  simplified  version  of  the  original  model  which  is  driven  by  the  same 
random  number  streams  as  the  original  model),  our  research  deals  only  with 
internal  or  so-called  concomitant  controls. 

There  are  several  tactical  issues  that  must  be  addressed  to  employ 
control  variables  successfully.  One  such  issue  is  efficiency.  Several  authors 
(see  review  of  the  literature)  have  addressed  the  fact  that  a  trade-off  must  be 
recognized  in  the  application  of  the  control  variable  technique.  Various  loss 
factors  have  been  derived  to  quantify  the  diminishing  marginal  returns  that 
are  experienced  (on  the  average)  when  additional  control  variables  are 
included  in  the  variance  reduction  scheme.  This  trade-off  arises  because  the 
application  of  control  variables  requires  the  estimation  of  additional  control 
coefficients.  If  the  sample  size  is  taken  to  be  constant,  then  the  variance 
reduction  achieved  by  the  use  of  additional  controls  can  be  offset  by  the 
variance  inflation  due  to  the  estimation  of  additional  coefficients.  Hence  a 


selection  scheme  is  needed  to  pick  a  good  subset  of  the  candidate  controls. 


The  control  variable  selection  problem  is  important  because  more  often 
than  not  a  simulator  trying  to  use  control  variables  finds  himself  confronted 
with  multiple  candidates  for  controls.  The  literature  to  date  only  offers  ad 
hoc  methods  to  solve  this  problem.  Several  authors  have  called  for  research 
into  this  problem;  in  particular  Lavenberg,  Moeller,  and  Welch  (1982), 
Rubinstein  and  Marcus  (1985),  and  Venkatraman  and  Wilson  (1986)  have  all 
suggested  methods  for  developing  an  effective  control  variate  selection 
procedure.  Unfortunately  there  has  been  no  follow-up  work  on  any  of  these 
proposals. 

1.1  Research  Objectives 

Our  primary  objective  is  to  formulate  and  evaluate  control  variate 
selection  criteria  for  multiresponse  simulation  experiments  in  which  we  seek 
point  and  confidence  region  estimators  for  the  mean  response.  We 
distinguish  the  following  cases:  (a)  the  covariance  matrix  of  controls  is 
unknown,  and  (b)  the  covariance  matrix  of  the  controls  is  known  and  is 
incorporated  into  the  point  and  confidence  region  estimator.  The  second  case 
requires  the  introduction  of  a  new  point  estimator  and  the  the  derivation  of 
its  mean  vector  and  covariance  matrix.  We  also  introduce  a  new  class  of 
controls  that  we  call  "routing  variables”,  and  we  establish  the  asymptotic 
distribution  of  these  controls  so  that  we  may  exploit  them  not  only  in  case 
(a)  but  also  in  case  (b).  The  experimental  evaluation  phase  of  the  research 
includes  a  comparison  of  the  performance  of  the  controlled  estimation 
procedures  described  in  (a)  and  (b)  above. 
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1.2  Organization  of  the  Research 

We  review  both  the  control  variable  literature  as  well  as  the  pertinent 
statistical  literature  which  bears  on  variable  selection  in  the  context  of  linear 
regression.  The  literature  review  is  presented  in  Chapter  2.  Chapter  3 
presents  theoretical  arguments  which  lead  to  selection  criteria  for  both  cases 
mentioned  above.  In  Chapter  3  we  also  derive  the  properties  of  the  new 

_  a 

estimator  Y(~i).  Chapter  4  describes  our  experimental  setup.  We  discuss  the 
models  used,  the  experimental  layout,  and  the  necessary  matrix  methods 
required  to  implement  the  selection  algorithm.  In  Chapter  5  we  summarize 
the  results  of  our  experiments.  In  Chapter  6  we  present  an  overview  of  the 
research  and  propose  directions  for  future  research. 


v/ 


CHAPTER  2 


LITERATURE  REVIEW 


The  first  notable,  comprehensive  discussion  of  control  variables  (actually 
variance  reduction  techniques  in  general)  is  offered  by  Kleijnen  (1974).  A 
more  rigorous,  up  to  date  survey  of  variance  reduction  techniques  is  found  in 
Wilson  (1984). 

2.1  Univariate  Simulation  Response  with  a  Single  Control 

Assume  Y  is  an  estimator  of  [iy ,  where  nY  is  the  mean  of  some  response 
of  interest.  Let  X  be  a  variable  observed  during  the  course  of  the  simulation. 
We  assume  that  X  is  highly  correlated  with  the  response,  and  further  that  its 
mean  13  known.  The  variable  X  is  the  control  variable. 

Consider  the  “controlled  estimator” 

Y(b)  -y  -b(X-nx)  (2.1.1) 


Note,  if  b  is  a  constant 


E{Y(b ))  —  nY  , 


(2.1.2) 


o 


and  var(Y(6))  =  var(Y)  +  62var(X)  -  26cov(Y,X)  .  (2.1.3) 


So  Y(b)  is  an  unbiased  estimator  of  py .  The  variance  of  Y(b)  will  be  smaller 
than  the  variance  of  Y  if 

2b  cov(Y ,  X )  >  62var(X)  .  (2.1.4) 


A  little  calculus  reveals  that 


cov(Y  ,X) 
var(X) 


(2.1.5) 


minimizes  (2.1.3).  Plugging  (2.1.5)  into  (2.1.3)  yields  the  minimum  variance 


var(Y(.3))  =  (1-pjjy)  var (Y)  ,  (2.1.6) 


where  pyx  is  the  correlation  coefficient  between  Y  and  X.  Following  Porta 
Nova  (1985),  we  obtain  an  unbiased  point  estimator  of  pY  by  averaging  the 
controlled  observations 


Yi{0)  =  Yi-d[Xi  -  PX\  *  -  h 


(2.1.7) 


to  form 

Y{i3)  =  E  Y,(3)/K, 

i-i 


(2.1.8) 


where  K  is  the  sample  size.  Since  we  do  not  know  the  optimal  value  3,  we 
must  estimate  it.  An  intuitive  estimate  of  3  replaces  the  right-hand  side  of 
(2.1.5)  with  the  appropriate  sample  quantities.  This  solution  turns  out  to  be 
the  least  squares  solution  for  3.  When  the  assumption  of  joint  normality 
between  Y  and  X  is  made,  then  the  least  squares  solution  is  also  the 
maximum  likelihood  solution.  We  estimate  3  by 


v(Y,-y)(x;-x) 


M-*)2 


(2.1.9) 


and  the  point  estimator  of  /Xy  is  then 


EW)/*. 


(2.1.10) 


We  obtain  an  interval  estimate  for  /Xy  by  application  of  regression  theory. 
First  we  make  note  of  what  happens  under  the  assumption  of  joint  normality 
for  Y  and  X.  In  this  situation  the  conditional  distribution  of  Y  given  X  is  also 
normal: 


Y  |  X=x  ~  N{vy  +  3(x  -  /xx),cxe2) 


(2.1.11) 


Of  —  <7y(l  —  Pyy) 


(2.1.12) 
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and 


oY  =  var(Y ) 


(2.1.13) 


We  see  that  if  X=x,  there  is  a  linear  regression  of  Y  on  X.  Given  we  know 
values  of  the  control  variable  X,  as  well  as  its  mean,  we  see  that  the 
conditional  mean  of  Y  has  two  terms.  The  first  term  is  \iY  >  the  parameter  to 
be  estimated.  The  second  term  is  a  correction  due  to  the  particular  values  of 
the  control.  To  get  at  /iy,  we  will  subtract  out  these  corrections  as  in 
(2.1.7).  Equation  (2.1.11)  shows  us  that  each  observed  Y,  has  the  form 

Y,  =  My+/3(-Xi  —  Hx)  d"  et  1  <  i  <  K,  (2.1.14) 


where  e,  are  the  residuals 

e,  ~  iV(0,<7(2)  (2.1.15) 


There  are  two  unknown  quantities  in  (14),  so  we  can  apply  the  method  of 
least  squares  to  solving  for  nY  and  13.  The  parameter  fJ-Y  is  the  intercept  of 
equation  (14)  and  under  the  joint  normality  assumption  for  X  and  Y, 


HyW)  ~  N{nY,a?s n), 


(2.1.16) 


where  <sn  is  the  upper  left-hand  corner  entry  of  the  matrix  (D'D)  1  where 


1  Xx—ijx 

1  X2—f^x 
1 


D  = 


(2.1.17) 


1  XK—fJ.x 


Now  to  form  a  confidence  interval  about  /Liy(/?)  we  will  first  need  an  estimate 
of  <r2.  Remembering  that  <72  represents  that  variability  in  Y  given  we  have 
accounted  for  X,  the  formula  for  the  residual  mean  square  error  given  in 
regression  theory  makes  good  intuitive  sense  as  an  estimator  of  cr2,  that  is 

E  Pi  ~  Yt  )2 

**  -  —  0  >  (2.1-18) 


where 

Y^)  =  V-Y0)  +  -Mx).  1  <*<K  (2.1.19) 


Now  it  can  be  shown  (  Hogg  and  Craig  (1970),  pg.  337)  that 


My(/E  —  My 


where  t^_ 2  is  a  Student-t  distribution  with  K-2  degrees  of  freedom.  From 
regression  theory  sn  is  given  by  (  Draper  and  Smith  (1981),  pg.  83  and  some 


algebra  because  our  A"  is  X ,  —  Mx  ) 

-mx)2 

«11  =  - 1“>  (2.1.21) 

K  -  JO2 


where 


K 

(2.1.22) 


A  100(1-Q;)%  confidence  interval  is  given  by 

^y{0)  (2.1.23) 


As  mentioned  in  the  introduction  we  expect  to  incur  a  loss  due  to  the 
estimation  of  0.  In  this  case  where  we  only  have  a  single  control  variable,  we 
expect  that  the  realized  variance  reduction  should,  on  the  average,  decrease 
as  sample  size  decreases.  This  loss  is  quantified  via  the  loss  factor. 
Following  Lavenberg,  Moeller  and  Welch  (1982),  we  define  the  loss  factor  as 
the  ratio  of  the  variance  of  the  estimator  of  fj.Y  when  the  optimal  control 
coefficient  is  not  known  to  the  the  variance  of  the  estimator  when  the 
coefficient  is  known.  So 


Later  we  will  change  the  abbreviations  to  more  standard  Greek  svmbois. 


2.2  Univariate  Simulation  Response  with  Multiple  Controls 

The  previous  discussion  can  be  extended  to  the  case  of  multiple  controls. 
We  summarize  the  development  presented  by  Lavenberg  and  Welch  (1981) 
for  simulation  output  analysis  based  on  independent  replications,  batch 
means,  and  regenerative  analysis. 

2.2.1  Output  Analysis  Using  Independent  Replications  or  Batch 
Means 

During  the  course  of  making  simulation  runs,  we  observe  the  values  of 
the  response  of  interest  as  well  as  the  Q  control  variables.  Separate 

observations  could  occur  as  the  result  of  independent  replications  of  the 

simulation  model.  These  observations  could  also  result  from  the  use  of 
batching  to  form  nearly  independent  observations.  Let  X  be  a  Qxl  vector  of 
controls,  i.e.,  X  =  (Xlt  ■  .  .  ,  Xq  )'  with  known  mean  vector 

/'x  —  {(l\>  ■  ■  ■  <  t^Q  )'  and  let  B  =  (b  x>  .  .  .  ,  bq)  be  a  IxQ  row  vector  of 

constants,  then  the  controlled  estimator  of  f-iy  becomes 

Y(B)  =Y  -  B(X- nx)  .  (2.2.1) 

The  vector  0  which  minimizes  Var(Y(B))  is  given  by 

0  —  oyxExx.  (2.2.2) 

where  Yxx  is  the  Q  xQ  covariance  matrix  of  the  controls  and  rrYx  is  the  IxQ 
vector  of  covariances  between  the  response  and  the  controls.  See  Anderson 


(198-4),  pg.  39,  for  a  proof.  The  resulting  minimum  variance  is 


Var(y  (.->’))  =(  1  -  /»y’x)  var(Y'), 


(2.2.3) 


where  pyx  ls  coefficient  of  multiple  correlation  between  Y  and  X.  The 
authors  next  comment  on  the  availability  and  choice  of  control  variables. 
They  cite  many  application  papers  and  distinguish  between  external  and 
concomitant  controls. 

The  previous  discussion  hinged  on  the  assumption  that  3  is  known.  This, 
of  course,  is  not  the  case  in  practice  (otherwise  there  would  be  little  need  to 
simulate  a  process  in  the  first  place).  We  must  estimate  3  and  incorporate 
the  estimate  into  an  effective  statistical  procedure  to  estimate  .  To  obtain 
an  unbiased  estimator  of  pY ,  we  make  K  independent  replications  of  the 
model  or  we  organize  the  output  from  one  run  into  K  batches  so  that  means 
computed  from  each  batch  are  approximately  independent.  If  X*.  is  the 
vector  of  controls,  observed  on  the  ktfl  replication  or  batch,  then  we  compute 

Yk(B)  =  Y  -  B(Xk  -  nk)  ,  k  =  1,  ...  ,K.  (2.2.4) 


A  sensible  estimator  from  the  entire  data  set  would  be 

-  T  snw . 

K  k- 1 


(2.2.5) 


Now 


var (Y(B))  =  var (Y(B))  . 


(2.2.6) 


However,  we  still  do  not  know  the  optimal  value  of  B,  and  we  must  estimate 
it.  One  estimate  of  3  is 


,>  _  fj  v-i 


(2.2.7) 


where  Yyx  and  <7yx  are  the  sample  analogs  of  ~xx  and  <7yx.  We  now 
substitute  3  for  B  in  (2.2.4)  and  (2.2.5)  to  produce  the  estimates  Y k{3)  and 
Y(3).  In  general  Y[3)  is  not  an  unbiased  estimator  because  3  and  X  are  not 
in  general  independent.  A  simplfying  assumption  is  that  (Y ,  X)  are  jointly 
distributed  as  multivariate  normal  random  variables,  see  Cheng  (1978).  This 
assumption  may  be  justified  by  the  use  of  sample  means  as  controls  (as  well 
as  the  estimator  Y).  In  this  case  Y{3)  is  an  unbiased  estimator  of  fj.Y ,  and 
using  regression  theory  (discussed  in  greater  detail  later)  we  obtain  an 
estimator  of  var(Y(  J))  such  that 


Y{3)—^y 

var (F(J)) 


(2.2.8) 


where  tK_Q_{  is  a  t-distributed  random  variable  with  K-Q-l  degrees  of 
freedom.  Q  is  the  number  of  controls.  Explicitly 

var(f(i))  =  <7,2su, 


where  <~t J  i3  given  by  (2.2.38)  and  su  is  described  in  (2.1.17).  This  leads  to 
the  familiar  100(1—  a)%  confidence  interval  for  fJ.Y 


Y{J)  ±  Ik-q- i(1-'1/2)VsItT 


(2.2.9) 


j  be  procedure  used  to  construct  the  interval  given  in  (2.2.9)  is  discussed  in 
greater  detail  in  a  subsequent  section. 

Since  .?  is  estimated  by  one  would  expect  that  some  loss  in  variance 
reduction  would  be  incurred.  We  define  the  loss  factor  to  be  the  ratio  of  the 
variance  of  the  controlled  estimator  when  the  controls  are  unknown  (hence 
must  be  estimated)  to  the  variance  of  the  controlled  estimator  when  the 
controls  are  known.  L'sing  the  notation  of  Venkatraman  and  Wilson  (1986), 
we  let  Xj  denote  the  loss  factor,  we  will  show 


K- 2 
K-Q  -2 


(2.2.10) 


This  factor  is  derived  from  the  following  considerations.  If  we  do  not  use 
controls  then  bv  direct  estimation 


var(Y)  = 


(2.2.11) 


where  <7y  is  the  variance  of  Y.  Now  if  we  know  the  control  coefficients,  then 
from  (2.2.3) 


var(Y(7))  =  (1-Pyx)' 


(2.2.12) 


The  ratio  of  (2.2.12)  and  (2.2.11)  is  called  the  minimum  variance  ratio  (rq' 


and  we  see  that  100(1-^)  is  the  percentage  variance  reduction  achievable 
when  B  is  known. 


When  we  estimate  3  with  3  we  are  now  interested  in 


var(T) 


(2.2.14) 


r/j  is  called  the  variance  ratio.  We  note 


van  i  J 


var(y(J))  var(y) 


=  Vh  - 


(2.2.15) 


where  Xj  is  the  loss  factor  due  to  the  estimation  of  B. 

At  this  point  we  have  all  the  pieces  save  an  expression  for  var(y(3)). 
The  following  details  are  from  Lavenberg,  Moeller,  and  Welch  (1982).  First 
we  remember  (2.2.5) 


y(3)  -y  -  ^(x-mx). 


(2.2.16) 


To  get  var(y(J))  the  technique  will  be  as  follows.  First  we  will  write 

_  A 

var(y(3))  as  a  linear  combination  of  the  Yk,  then  we  will  fix  the  controls, 
compute  the  conditional  variance,  and  finally,  exploit  the  conditional 
unbiasedness  of  Y(3)  by  computing  the  variance  of  Y(3)  as  the  expected 
value  (  with  respect  to  the  controls)  of  the  conditional  variance. 


Define  M  as  a  QxK  matrix  such  that 


1 


m 


In— xi  xik~ x\ 


(2.2.17) 


XQ\~XQ  XQK  XQ 


where  x,^  is  the  value  of  the  ith  control  on  the  jt>l  replication  or  batch,  also 


xt  is  the  sample  mean  of  the  control.  From  (2.2.2)  we  can  write  B  as 


3  =  aYx±£  =  (r-yi^)'M'(MM')-1 


where  is  a  column  vector  of  Is.  Now  we  can  write 


where 


Y{3)  =  b'Y  , 


6'  =  —  IV  —  (X  —  mx)'(MM')-1M 


(2.2.18) 


(2.2.19) 


(2.2.20) 


Given  X*  =  xk  for  k  = 1,  ...  ,  K,  then  6 1  is  a  constant  vector.  Now  we  have 
the  conditional  estimator  in  terms  of  the  Y  k.  We  compute 

var(Y(J)  |  Xk—xk  for  all  k)  —  b'  (varF  |  Xk=xk  for  all  k)~^b  ,  (2.2.21) 


which  reduces  to 


•ar (F(i)  j  Xfc=ifc  for  all  k)  =  cr;  -77+(X-^x)'(\LM/)~I(X-Mx)  , 

A 


(2.2.22) 


where  a,  is  the  residual  variance  (as  described  in  (2.2.34)).  Now  we  find  the 
expected  value  of  (2.2.22)  with  respect  to  the  controls,  and  we  get 


-  -  l  a'2  [  -  i  - 

r  (Y(3))=EX  —  [l-f^X-^MM'rVx-Mx)]  .  (2.2.23) 


Now  since 


K- 1  ’ 


(2.2.24) 


we  have 


var (Y(3))  =  EX  ~  1+-^-(X-mx)'2x1(X-Mx) 


(2.2.25) 


We  note  that  (  Anderson  (1984)) 


T2  =  K{X-nx)til(X.-fMX) 


(2.2.26) 


is  Hotelling’s  T 2  statistic.  Also  Corollary  5.2.1  of  Anderson  (1984)  gives 


AT(X—  Vx)'— x  (X— fj-x) 

K- 1 


F 

Q  ^fQ.k-Q  • 


Kenney  and  Keeping  (1951)  give 


E{fQ,K-Q  ) 


K-Q 
K-Q-2  ‘ 


Now  (2.2.25)  becomes 


K 


K- 1 


■S(T2) 


K 


1  + 


AT-Q 


E[fq,k-q) 


which  finally  reduces  to 


var(F(5))  =  ^ 


R. 


K-Q  -2 


Examination  of  (2.2.14)  reveals  that  \  ,  the  loss  factor,  is 


x  =  iHimn 

1  _  var (Y(3))  ’ 


so  from  (2.2.12)  and  (2.2.30) 


X 


i 


K—2 

K-Q-2 


Now  that  we  have  a  theory  which  lets  us  develop  confidence  intervals  and 
quantifies  the  loss  incurred  due  to  the  estimation  of  3,  we  need  a  statistically 
valid  procedure  to  construct  the  intervals.  Lavenberg  et  al.  develop 
procedures  based  on  the  method  of  independent  replications  as  well  as  the 
regenerative  method.  Here  we  summarize  only  the  method  of  independent 
replications,  and  in  a  later  section  we  discuss  procedures  for  the  regenerative 
method. 


Now,  Y ,  X  =  (xj  Xq  )'  are  assumed  to  be  jointly  distributed  as  a 
multivariate  normal.  Conditional  on  X  =  x,  Y  will  be  distributed  as 
univariate  normal  with 


I  X  -  x)  -  +i{X— Hx)  , 


(2.2.33) 


where  3  is  given  by  (2.2.2)  (  the  optimal  control  coefficient  vector).  The 


variance  is  given  by 


var(Y  |  X  =  x)  =  Oy(l-pyx)  . 


(2.2.34) 


So  if  we  take  the  X  as  fixed  we  have  the  linear  regression  problem  with 


1  xn~Vx,  -  i-M 


x0  9 

Ji 


Yk  1  x\K~Llz  xOK~ll  in  3 


(2.2.35) 


where  xtJ  is  as  given  in  (2.2.17)  and  nz  is  the  is  the  known  mean  of  the  ith 
control.  It  becomes  apparent  that  we  will  be  estimating  f.iy  with  the  least 
squares  estimate  for  the  intercept  of  (2.2.35).  We  form  our  confidence 
interval  in  the  standard  manner. 

Let  Hy  and  3  be  the  corresponding  estimators  of  fly  and  3  and  let  D 
denote  the  Kx(Q+ 1)  matrix  on  the  right  hand  side  of  (2.2.35).  From 
regression  theory  the  conditional  distribution  of  3  given  D  is 

3  -  Nq^{3,  af(D'D)_1)  .  (2.2.36) 


Hence 


My  "•  *M My  >  f7‘su)  > 


(2.2.37) 


where  su  is  the  upper  left  most  corner  of  (D'D)  l.  Now  all  that  is  required  is 
an  estimate  of  the  common  variance  cr2.  Such  an  estimate  is  given  by 

(En2-£  (My +•}(** -Z^z))2) 

~  n  fc  — 1  k  =  1 
<7~  =  - 

K-Q- 1 

So  given  the  observed  values  of  the  control  variables  a  100(1 —a)  % 
confidence  interval  for  (j. Y  is  given  by 


(2.2.38) 


Mr  ±  tK-Q- 1(1— 'i/2)\/^u  rr.  • 


(2.2.39) 
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2.2.2  Output  Analysis  Using  the  Regenerative  Method 


Lavenberg  and  Welch  (1981)  summarize  a  methodology  for  the 
construction  of  confidence  intervals  based  on  the  regenerative  method.  A 
more  detailed  development  is  found  in  Lavenberg,  Moeller  and  Sauer  (1979) 
as  well  as  in  Iglehart  and  Lewis  (1979). 

The  regenerative  method  can  be  based  on  a  single  run  of  the  model. 
The  method  may  be  applied  if  there  exists  an  increasing  sequence  of  random 
times  that  partition  a  run  into  independent  and  identically  distributed  cycles. 
This  sequence  of  regeneration  times  typically  correspond  to  some 
distinguished  state  of  the  model.  This  state  is  such  that,  when  it  is  entered, 
the  model  starts  afresh  according  to  the  same  probabilistic  mechanism  that 
drove  the  previous  cycles.  Lavenberg  and  Welch  (1981)  point  out  that,  in 
complex  simulations,  regeneration  points  may  occur  so  infrequently  as  to 
discourage  the  use  of  this  method.  The  construction  of  confidence  intervals 
using  the  regenerative  method  is  discussed  in  detail  in  Crane  and  Lemoine 
(1977),  Iglehart  (1978)  and  Welch  (1983). 

We  follow  Iglehart  and  Lewis  (1979)  in  their  application  of  the 
regenerative  method  using  control  variables.  Assume  we  observe  in  a  run  of 
n  cycles: 

{Yj,  Tv  Xy)  :  1  <  j  <n  , 


where  r;  is  the  length  of  the  jth  cycle,  Y  j  is  some  response  of  interest  and  X; 
is  a  Q  x  1  vector  of  controls.  Many  steady-state  parameters  of  interest  can  be 
expressed  as  the  ratio  of  two  expected  values.  Let  r  be  such  a  variable.  A 


controlled”  point  estimator  of 


_  E\Y\ 
r  E\r) 


is  given  by 


(2.2.40) 


where  b  is  a  1  xQ  row  vector  of  control  coefficients  and  Y ,  X  and  7  are  the 
sample  means  of  Y ,  X  and  r.  We  note  that  controls  are  only  being  applied 
to  the  numerator  of  the  estimator.  This  type  of  estimator  is  called  a  top- 
controlled  estimator.  Eakle  (1982)  developed  a  two-stage  procedure  which 
first  applied  controls  to  the  denominator  of  (2.2.40)  to  reduce  the  bias  of  r 
and  then  applied  another  set  of  control  variables  to  the  numerator  to  reduce 
the  variance  of  the  estimator.  We  discuss  only  top-controlled  estimators. 


To  obtain  an  interval  estimator  for  r,  it  can  be  shown  that 


\fn  (r  (6  )  —  r )  D 


c{b)/  T 


n— *■  oo 


•JV(0, 1), 


where  - ►  denotes  convergence  in  distribution  and 

n — >cc 

cr2(6)  =  var (Y j  —  rr;  —  6X;).  If  we  replace  a{b)  by  an  asymptotically 
consistent  estimator  s(b)  then  the  same  convergence  applies  and  we  can 
construct  confidence  intervals.  However,  we  typically  do  not  know  the 
optimal  values  of  b.  The  optimal  value  of  b  is  f3  and  is  given  by 
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3  —  p(  1  —  rr,  X)Lxx) 


(2.2.41) 


■where  ^xx  1S  the  Q  xQ  covariance  matrix  of  controls  and  p(Y  —  rr,  X)  is  the 
lxQ  row  vector  of  covariances  between  Y  —  rr  and  X.  The  vector  3 
minimizes  the  variance  of  r.  An  estimate  of  3  ,  3  is  obtained  by  using  the 
appropriate  sample  covariances  in  (2.2.41).  The  variance  of  r  can  be 
estimated  using 

M  r)  =  s2(3)/(jV^)2, 

where 

*2(h  -  -r-r S  \r,  - h  W  - X)\. 

n  i  y_j  L 


An  asymptotically  correct  100(1—  a)%  confidence  interval  is  given  by 


r  ± 


Z0/2s{3) 

\fa  ~ 


Lavenberg,  Moeller  and  Sauer  (1979)  describe  a  specialized  set  of  ratio-type 
controls  for  regenerative  estimators. 

2.2.3  Analysis  Techniques  for  Nonnormal  Responses 

Lavenberg,  Moeller  and  Welch  (1982)  present  a  method  of  producing 
confidence  intervals  based  on  the  jackknife  statistic,  and  they  apply  this 
method  in  a  Monte  Carlo  study  of  a  broad  class  of  closed  queueing  networks. 


_  /V 

Let  be  the  estimator  computed  (2.2.16)  using  the  methodology  of 

Lavenberg  and  Welch  (1981)  when  the  kth  observation  has  been  deleted. 


Compute  the  “pseudovalues” 

Jk=  K  Y{3)  -  [K—l)Y (jt)(^),  1  <  k  <  K  . 


Now  calculate  the  jackknife  statistic 


m  - 


K 


K 

Zh 

Jt— i 


(*) 


and  the  sample  variance 


tttEWPI-# 

K  1  ;-l 


An  asymptotically  valid  confidence  interval  is  given  by 


J{3)  ±  tK_x [l—a)Sj0)/\/~K  . 


We  are  referred  to  Arvensen  (1969)  for  proof.  These  intervals 
mild  regularity  conditions  given  in  Miller  (1974). 


(2.2.42) 


(2.2.43) 


(2.2.44) 


(2.2.45) 


hold  under 


2.2.4  Experimental  Results 


Lavenberg,  Moeller  and  Welch  (1982)  apply  the  control-variate 
confidence  intervals  (2.2.39)  and  (2.2.45)  across  a  general  class  of  closed 
queueing  networks.  They  develop  three  types  of  controls,  service  time 
variables,  flow  variables,  and  work  variables.  The  networks  considered  take 
the  follo%ving  form.  Consider  a  finite  set  (say  of  size  S)  of  interconnected 
service  centers.  These  centers  service  D  different  types  of  customers.  There 
are  a  total  of  N  customers  of  all  types.  Assume 

1.  Markovian  Routing  so  that  the  next  station  visited  only  depends  on  the 
current  location. 

2.  The  service  times  for  the  the  jth  type  of  customer  at  the  itk  service 
station  are  drawn  independently  from  identical  populations  with  finite 
mean  and  variance. 

3.  Service  time  sequences  and  sequences  of  centers  visited  are  mutually 
independent. 

The  above  networks  form  a  general  class  of  closed  queueing  networks. 
Since  the  only  random  components  of  this  system  are  derived  from  the 
service  time  distributions  and  the  multinomial  routing  distributions,  functions 
of  these  variables  can  be  used  as  internal  controls. 

The  authors  perform  a  rather  extensive  study  across  many  different 
networks  of  the  type  described  above.  Three  response  variables  were  studied 
separately:  the  long  run  average  waiting  time  (by  customer  type),  the  long 
run  rate  at  which  departures  occur  (by  customer  type),  and  the  long  run 
average  response  time  (by  customer  type).  The  following  are  important 


conclusions  of  their  work: 


1.  Work  variables  exhibit  the  smallest  minimum  variance  ratios.  (The 
authors  expected  this  since  these  variables  contain  both  information  on 
service  time  and  flow). 

2.  The  loss  factor  derived  in  Lavenberg  and  Welch  (1978)  appeared  to 
adjust  the  minimum  variance  ratio  correctly. 

3.  The  actual  coverage  probability  for  nominal  100(1— a)%  confidence 
intervals  did  not  suffer  with  the  application  of  controls. 

4.  The  regression  method  produced  substantially  smaller  confidence 
intervals  than  the  jackknife  method  (with  no  appreciable  degradation  in 
coverage). 

5.  The  forward  selection  procedure  (Draper  and  Smith  (1981))  was  used  to 
cope  with  the  control  variable  selection  problem. 

Wilson  and  Pritsker  (1984a, b)  offer  theoretical  and  experimental  results 
on  what  they  call  “standardized”  concomitant  variables.  Assume  we  are 
dealing  with  a  Q-station  queueing  network.  Define  the  input  processes  as 
{{U }(k)  :  j  >  1)}  ,  1  <  k<  Q.  Control  variables  will  necessarily  be  functions 
of  these  inputs.  Consider  a  control  of  the  form 

Xk(t)  =  (1  /a{k,t))  asVi(*)-M*)  ,  (2.2.46) 

j-i 

where  a(k,t )  =  number  of  service  times  that  are  started  at  station  k  during 
the  time  period  0,ti.  Also  is  the  known  mean  of  the  kth  control.  Wilson 


(1982)  showed  that  controls  of  the  type  given  by  (52)  have  asymptotic  mean 
and  variance  equal  to  zero.  He  states  that  this  result  also  applies  to  the 
“work”  variables  given  by  Lavenberg,  Moeller,  and  Welch  (1982).  As  a 
consequence  of  this  fact,  the  covariance  matrix  of  the  controls  becomes 
asymptotically  singular.  The  authors  offer  remedy  in  the  form  of 
standardized  controls.  Consider  controls  of  the  following  form: 

Xk{t)  =  (a(M))~1/2  E  \uj(k)  -  Hk)/°k  •  (2.2.47) 

t-i 

Here  <7k  is  the  known  standard  deviation  of  the  kth  control.  The  vector  of 
standardized  controls  is  shown  to  converge  to  a  multivariate  normal 
distribution  with  zero  mean  vector  and  identity  covariance  matrix,  as  the  run 
length  goes  to  infinity. 

One  may  standardize  the  "work"  variables  given  in  Lavenberg, 
Moeller, and  Welch  (1982)  by  defining  the  controls  as 

Xk(t )  =  (V/(*,t)M(/(0))^  Vy(fc)  -  ,  (2.2.48) 

where  uk  =  relative  frequency  with  which  a  customer  visits  station  k  and 
f{k,t)  =  number  of  service  times  that  are  finished  at  station  k  during  time 
period  (0,t). 

Wilson  and  Pritsker  develop  a  theory  of  controlled  replication  analysis 
which  is  based  on  the  asymptotic  multivariate  normality  assumption  and 
present  selected  simulation  results.  A  more  thorough  experimental  treatment 
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is  given  in  Wilson  and  Pritsker  (1984b).  The  variance  reductions  observed  in 
both  papers  certainly  offer  compelling  evidence  that  further  research  in  these 
areas  may  prove  extremely  fruitful.  Experiments  were  carried  out  for 
controlled  replication  and  controlled  regeneration  analysis.  The  systems 
studied  were  a  class  of  closed  and  mixed  queues  representing  machine-repair 
systems.  In  the  controlled  replication  experiments,  variance  reductions  in  the 
range  from  20%  to  90%  were  observed  with  confidence  interval  reductions 
ranging  from  10%  to  70%.  After  the  effects  of  initialization  bias  were 
removed,  no  significant  loss  in  coverage  was  observed.  In  the  controlled 
regenerative  experiments,  variance  reductions  in  the  30%  to  90%  range  were 
observed  with  confidence  interval  reductions  of  20%  to  65%.  Some  coverage 
difficulties  were  noted  (probably  due  to  the  inherent  bias  of  the  regenerative 
estimator)  but  degradation  seemed  to  stay  within  about  10%  of  nominal. 

2.3  Univariate  Simulation  Metamodel  with  Multiple  Controls 

In  all  the  papers  reviewed  to  this  point,  we  have  been  working  with  a 
single  underlying  population  and  sampling  from  it.  If  we  allow  the  population 
to  vary,  say  over  the  design  points  of  an  experimental  design,  then  we  are 
working  in  the  multipopulation  domain.  Nozari,  Arnold  and  Pegden  (1984) 
discuss  the  application  of  control  variables  for  multipopulation  experiments. 
These  experiments  typically  involve  some  form  of  a  general  linear  model. 
This  model  represents  a  simplification  of  the  simulation  and.  as  such,  it  is 
often  called  a  metamodel.  One  object  of  multipopulation  experiments  is  to 
find  a  metamodel  which  can  closely  predict  a  response  across  the  domain  of 
factor  levels.  Factor  levels  here  correspond  to  to  the  design  points  mentioned 
earlier.  Another  object  of  such  experiments  is  to  identify  as  closely  as 


possible  the  values  of  the  metamodel  coefficients.  Such  information  may  tell 
us  the  relative  sensitivity  of  a  response  to  a  particular  factor.  We  also  may 


glean  whether  or  not  certain  factors  should  be  included  in  the  metamodel. 
This  latter  objective  is  the  thrust  of  the  research  put  forward  by  Nozari  et 
al. 


Let  Y  =  (Yt  Y %)'  be  a  Kx  1  vector  of  independent  observations. 

Each  Yt  is  obtained  from  an  independent  run  of  the  model  and  each  has 
common  variance  o2.  Assume 

Y  -  NK{Z3,<rIK), 


where  Z  is  a  Kxm  known  matrix  of  rank  m,  3  is  a  mxl  vector  of  unknown 
coefficients  and  IK  is  the  KxK  identity  matrix.  The  factors  or  functions 
thereof  are  embodied  in  Z,  when  the  factor  levels  are  not  random  the  matrix 
Z  is  commonly  called  the  design  matrix.  Let  Yj  be  a  Qxl  vector  of  controls 
for  the  ith  observation  of  Y.  Assume  X,  has  a  known  mean  vector.  Without 
loss  of  generality,  we  assume  E[Xx)= 0.  Finally  assume 


JVi 


0s  S™ 

~XY  ^XX 


(2.3.1) 


where  E{YX)  =  /i,  and  E yx  and  E xx  are  covariance  matrices  of  the  response 
and  the  controls  and  the  controls  with  themselves,  respectively.  Let  E 
denote  the  (Q+l)  x  (Q-t-1)  covariance  matrix  in  (2.3.1).  Assuming  the 
metamodel  has  been  correctly  specified,  Nozari  et  al.  derive  expressions  for 


AS 


J 


Scheffe'  type  simultaneous  confidence  intervals  for  both  the  case  when  h,  is 
known  and  the  case  when  h.  is  unknown.  In  practice  N  is  unknown  and  must 
be  estimated.  In  this  case,  let  A  be  a  (m—k)-<m  matrix  of  rank  m-h.  Let 


g  —  [Z  x) ,  x  =  (x„  xKy, 


where  G  is  Kx(m+Q )  of  rank  (m+Q).  Now  for  simultaneous  confidence 
intervals 


P  r  1  v  'A  J  €  v  'A  3  rv  ± 


7  G'G  A'v 


V  m  h  f  =  1 — Ct, 


where 


3cv={Im  0)(G'G)-lG'Y 


-2  =  11Y  -  GfG'Q-^^Yll2 

K—m+Q 


and  F(nc)  is  the  alpha  percentage  point  of  a  F  distributed  random  variable 
also  (7m  O)  is  mx(m+Q)  with  Im  the  mxm  identity  matrix. 


I 
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Nozari  et  al.  define  efficiency  as 


var(J)  —  var(JCy) 


o  K  —  m  —  l 
K—m  —Q  —  l 


(z'zy\ 


where  3  are  the  estimated  coefficients  when  controls  are  not  used  and 
~  =  cr,2(l  —  Pyx).  In  similar  fashion  to  Lavenberg  et  al.  (1982),  the  loss 
factor  here  is 


K—m  —1 
K—m  —Q  —1 


2.4  Multiresponse  Simulation  with  Multiple  Controls 

Rubinstein  and  Marcus  (1985)  extend  the  development  of  Lavenberg, 
Moeller,  and  Welch  (1982)  to  the  estimation  of  a  multivariate  mean  response 
using  multiple  controls.  Extending  to  p  responses,  we  see  that  the  controlled 
estimator  becomes 


Y(B)  =  Y  -  B(X-  px)  , 


(2.4.1) 


where  Y  is  a  pxl  vector  of  responses,  B  is  the  pxq  matrix  of  control 
coefficients,  and  X  is  a  qxl  vector  of  controls  with  mean  vector  pLx. 
Rubinstein  and  Marcus  demonstrate  that  det(cov(Y(B))  =  |  cov(Y(£?))  | ,  the 
generalized  variance  of  Y(B),  is  minimized  by 

3  =  — yx—xx  .  (2.4.2) 


VVYv'.vv.v 

L7  mA  m 


-yx  =  ^((Y— /Jv)(X— //x) ) 


-xx  —  £((X— //x)(X— Mx)0 


(2.4.4) 


The  resulting  minimum  generalized  variance  is  given  by 


covY(J)|  = 


V  _  V  V  - 1  V  I 
— YY  — YX—XX-XY  I 


—  I“YyI  fl(l  —  Pi2).  (2.4.5) 


where  u  =  rank(Hyx)  and  pf ,  t  =  1,  ...  are  the  canonical  correlations 
between  Y  and  X  that  satisfy  px  >  ...  >  pL,  . 


The  authors  define  efificiency  of  control  variables  as 


var(Y)| 


(2.4.6) 


Porta  Nova  (1985)  points  out  that  the  use  of  the  term  efficiency  might  not  be 
warranted  here,  because  one  seeks  an  increase  in  an  efficiency  measure, 
whereas  we  seek  to  decrease  5X  .  In  any  case,  5?  measures  the  relationship 
between  Y  and  X.  One  can  see  by  examination  of  (2.4.5)  that  the  larger  the 
canonical  correlations,  the  greater  the  reduction  in  the  generalized  variance. 


Now  given 


,,  i  S'  v 

/'Y  — YY  — YX 

//•V  ’  ^  V*  » 

''X  — XY  —XX 


(2.4.7) 


we  take  a  random  sample  of  k  observations 


Z|  -  ^  .  i  -  1.  ...|  K. 


(2.4.8) 


The  authors  consider  two  cases  : 


1.  The  matrix  3  is  known. 


2.  The  matrix  3  is  unknown  and  must  be  estimated.  Case  (1)  can  be  used 
to  derive  the  multiresponse  analogue  rj2  of  the  minimum  variance  ratio 
r?!  (see  equation  (2.2.13)),  while  case  (2)  is  used  to  derive  the 
multiresponse  analogue  \2  of  the  loss  factor  Xj  (see  equation  (2.2.10)). 
Both  then  are  to  be  brought  together  to  form  what  the  authors  call 
efficiency  and  Venkatraman  and  Wilson  (1986)  call  the  variance  ratio 
(^)- 

Examination  of  (2.4.5)  and  (2.4.6)  show’s  that  the  minimum  variance 
ratio  is  given  by 


-  %  =  n(i  ~  p^) « 


(2.4.9) 


where  u  and  ft  are  as  in  (2.4.5).  Rubinstein  and  Marcus  point  out  that  r/2 


can  be  measured  by  the  ratio  of  the  squared  volume  Yf  of  the  confidence 
ellipsoid  formed  about  the  estimate  of  >1  Y  using  control  variables  to  the 
squared  volume  \  7  of  the  ellipsoid  formed  by  direct  simulation.  Specifically, 
(2.4.7)  implies  that 

K{Y— 7/y)'— ■ yy(Y— ^y)~Yp‘>  (2.4.10) 


where  \p“  is  a  chi-squared  random  variable  with  p  degrees  of  freedom. 

Hence  we  can  form  a  100(1— a)9c  confidence  ellipsoid  for  fj. Y  from 

Pr(Ar(Y-/iY)'-YY(Y-pY)  <  \p  !_0 )  =  1-ar  ,  (2.4.11) 


where 


Y 


1  K 
K 

A  j-i 


(2.4.12) 


The  volume  of  this  ellipsoid  is  given  by 

vi  =  P-1C(p)I-yy|1/2(xp2,i-o/^)p/2  ,  (2.4.13) 


where 


c(P)  =  2-'«/r(p/2). 


We  can  also  form  an  ellipsoid  based  on  the  controlled  estimator  from 


Pr{K{Y-^Y)‘ -v(,)(Y— /<Y)  <  \p,i-n)  =  l~(- 


(2.4.14) 


The  volume  of  this  ellipsoid  is 


(2.4.15) 


We  note  that  the  squared  ratio  of  (2.4.13)  to  (2.4.15)  produces 


si  —  7h  = 


(2.4.16) 


which  verifies  (2.4.9). 


In  practice  Hz  (the  covariance  matrix  for  Z  given  in  (2.4.7))  is  unknown 
and  it  must  be  estimated.  Let  Sz  denote  the  sample  covariance  matrix 


S7  = 


Sr y  SYx 
Sxr  Sxx 


(2.4.17) 


where,  for  example, 


5yy  — 


—  V(Y;-Y)(X;-X)', 
n  1  j'-i 


(2.4.18) 


where  Y  is  given  by  (66)  and  X  is  constructed  analogously.  Now  we  estimate 
3  by  3=SyxSxx  and  form  the  K  controlled  responses  as 


Y>(3)=Yy-.3(X>-MX),  1  <J<K. 


(2.4.19) 


Under  our  assumption  of  multivariate  normality  an  unbiased  estimator  of  /iY 


W)  =  T7  £  Y,(  J)  =  Y-J(X-/ix)  . 
K  j-l 


(2.4.20) 


We  can  form  a  100(1—0')%  confidence  interval  from  the  relationship  (Rao 
(1967)) 


Prj(Y(3)-MY)'EyW(Y(3)— <uY)  < 


(d'd){(K-Q-l)p/(K-Q-p)]FPtK_Q_p(l-a)  =  l-o, 


where 


d'  —  1  'k/K  —  (X—  fj.x)'{G'G)  G', 


where  G  is  defined  in  (2.4.25)  and  1 K  is  a  K  dimensional  column  vector  of 
ones.  Also  we  have 


^  K— 1  /  i 

^y|x  =  ^^-(^yy-SyxSxx^xy). 


Rubinstein  and  Marcus  define  efficiency  of  control  variables  as 


E I  I  — yix  1  (d,d)F 


E\(l/KY\Syy 


.•«  WV'-  V-V- **  'AWiV,  V.  ■  Av“, 


<■  .  - 
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this  is  the  ratio  of  the  expected  generalized  sample  variance  of  Y( 3 )  to  the 
expected  generalized  variance  of  Y.  They  prove  that 

e?  =  C  X{K,Q  ,p  )C  2(K  ,Q  ,p ) 

t  =  i 


where  u  =  rank(^l yx)  and 


Cx{K,Q,p)  =  n 


1-1 


(K-Q-i)(K-l)/(K-Q-l)(K-i) 


and 


C2{K,Q,p)  =  l+v 

J-l 


p  .QLQ+  2)  -  (g+2(j-i)) 

[jj{K~Q- 2)  ...  (K-Q-2J) 


we  see  immediately  that  the  loss  factor  for  this  measure  of  efficiency  is 

c,{k,q,p)  c2(k,q,p). 


Venkatraman  and  Wilson  derive  a  more  natural  extension  of  the  loss 
factor  of  Lavenberg,  Moeller,  and  Welch  (1982)  by  simply  calculating  the 
“efficiency”  or  variance  ratio  as 

rj2  =  rj2\2  ,  (2.4.21) 


where  rj2  is  the  variance  ratio  given  by 


var(Y) 


K-Q-2 


i  i  ^ 


1  =  1 


r/2  is  the  minimum  variance  ratio  given  by 


-  _  var(Y(.<))  _  -  , 

-  var(Y)  “.'.V1  ft>’ 


(2.4.23) 


and  X2  is  the  loss  factor  given  by 


\2 


varfYf  3))  _  K-  2  |P 
var(Y(^))  K-Q  -2 


(2.4.24) 


To  calculate  X2  we  need  an  expression  for  var(Yf.B)).  One  way  to 

_  A 

calculate  this  is  to  write  Y(,3)  as  a  linear  combination  of  the  uncontrolled 
responses  and  then  calculate  the  covariance  matrix  of  this  vector.  This 
procedure  is  an  extended  variant  of  the  procedure  used  in  Lavenberg, 
Moeller,  and  Welch  (1982). 

Let  G  be  defined  as 


G  = 


(X,-X)' 


xK-xy 


(2.4.25) 
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where  X, ,  1  <  i  <  K  is  a  Q  xl  column  vector  of  observed  controls  and  X  is 
the  Qxl  vector  of  the  sample  control  means. 

Analogous  to  equations  (2.2.18),  (2.2.19),  and  (2.2.20)  we  can  write 

Y{3)  =  Y  -  B(X-/jx)  =  (Ylt...,Yx)H  ,  (2.4.26) 


where  (Y;,..,^)  is  the  matrix  of  observed  uncontrolled  responses  and  H  is 
defined  as 


H  =  jflK  ~  G (G'G)-1(X-/ix) 


(2.4.27) 


where  1 K  is  a  K  dimensional  column  vector  of  ones.  The  authors  show  that 

_  a 

Y (J)  is  an  unbiased  estimator  of  /J\. 

var(Y(jl))  is  calculated  via  the  law  of  total  probability  for  expectations. 
First,  we  calculate  the  conditional  covariance  of  Y{3)  given  the  observed 
controls,  then  we  take  the  expectation  of  this  quantity  across  the  controls. 
This  double  expectation  is  the  correct  quantity  since  the  unconditional  and 
conditional  expectations  of  Y(5)  are  the  same. 

After  both  expectations  are  taken  we  get 

var(Y{.?))-  (2.4.28) 

where 


\  • 
“Y 


_  V 

X  -  -YY 


V 


V  -1  V 
YX-XX-XY 


(2.4.29) 
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cov(Y(~  ))  =  ■—  ^y!x  ■ 


(2.4.30) 


so  the  loss  factor  X2  is 


|  covY(' 


K- 2 
K—Q  —2 


(2.4.31) 


r/2  is  given  by  (2.4.23);  hence  the  generalized  variance  ratio  r?2  is  given  by 


K- 2  2N 

tetri' |  n(w.) 


(2.4.32) 


Obviously  we  require  K—Q—2  >  0. 


Venkatraman  and  Wilson  provide  guidance  for  limiting  the  number  of 
controls.  They  state  that  if  A  is  a  user  specified  upper  limit  on  the  loss  X2, 
then  at  most 


Q  *  =  (iC-2)(l  -  A"1/p) 


controls  should  be  applied. 


2.5  Multiresponse  Simulation  Metamodel  with  Multiple  Controls 

Porta  Nova  (1985)  extends  the  development  of  Nozari,  Arnold  and 
Pegden  (1984)  to  the  case  of  multiresponse  metamodels  with  multiple 
controls. 

The  model  employed  is  as  follows 

Y  =  Z3  +  XA  +  R 


where  Y  is  a  Kxp  matrix  consisting  of  K  p-dimensional  observations,  Z  is  a 
Kxm  design  matrix,  3  is  a  mxp  matrix  of  unknown  parameters,  X  is  a  KxQ 
matrix  consisting  of  K  Q-dimensional  vectors  of  controls,  A  is  a  Q  xp  matrix 
of  unknown  control  coefficients  and  R  is  a  Kxp  matrix  of  residuals. 


Porta  Nova  provides  point  estimators  for  3  and  A 


3=(Z'Z)~1  I—X(X'PX)~lX'P  \Y , 


a  =  (x'px)_1x'pr 


where 


P  =  l-Z{Z'Z)-lZ'. 


A  100(1— :>)%  confidence  ellipsoid  for  vec  3  (vec  3  is  a  column  vector 


obtained  by  concatenating  the  columns  of  3)  is  derived  from 


vec  (.5  -  vec  (.?  -  3)  \X  s  T*p{K-m-Q) 


where  T ^(K—m—Q)  is  Hotelling’s  T 2  with  K—m—Q  degrees  of  freedom, 
here 


B  =  ( Z'Z)~lZ 


I  -  X(X'PX)"1X'P 


and 


Ey,x  =  R'R  /{K—m—Q ) 


where 


R  =Y  - 


+  XA 


and  0  is  the  Kronecker  product.  The  Kronecker  product  of  the  mxn  matrix 
A  with  the  pxq  matrix  B  is  the  mpxnq  matrix 


AUB  •  •  •  AlnB 


A53B  = 


Am\B  '  '  '  AmnB 


I 

A  100(1—0;)%  confidence  ellipsoid  about  vec  3  is  formed  from 


< : V  /■;« 


o-  wNl.nl>. 


Porta  Nova  generalizes  the  minimum  variance  ratio  as 


FT  (x  —  Pi) 


he  also  provides  a  loss  factor  of 


^3 


K—m—l 
K—m—Q — 1 


where  u  —  rankEyx  and  pt  are  the  canonical  correlations  between  Y  and  X. 

2.6  Selection  of  Regression  Models 

Typically  a  practitioner  is  confronted  with  multiple  candidates  for 
controls.  It  is  possible  (in  view  of  the  loss  factor)  that  if  he  elected  to  use  all 
the  candidates,  he  might  actually  induce  variance  into  his  estimator. 
Therefore,  a  control  variate  selection  orocedure  that  finds  the  “best”  subset 
of  controls  is  desirable.  We  have  seen  that  the  principal  technique  employed 
to  exploit  control  variates  is  linear  regression.  In  our  application,  control 
variates  are  used  as  predictors  in  a  linear  regression  on  some  response  of 
interest.  The  statistical  literature  is  replete  with  papers  that  deal  with  the 
selection  of  predictor  variables  in  the  regression  context.  However,  relatively 
little  has  been  written  on  the  selection  of  controls.  In  this  section  we  review 
control  variate  selection  techniques  as  well  as  the  more  general  literature  on 


the  selection  of  predictor  variables  in  linear  regression. 

2.8.1  Review  of  Control  Variate  Selection  Techniques 

Lavenberg,  Moeller  and  Welch  (1981)  applied  a  restricted  variant  of 
“forward  selection”  regression  (Draper  and  Smith  (1981),  Chap.  6)  to  selecting 
a  “best”  set  of  controls.  Nozari,  Arnold  and  Pegden  (1984)  developed 
variants  of  the  “all  regressions”  and  “forward  selection”  procedures.  These 
variants  were  tailor-made  for  the  selection  of  controls  in  the  situation  where 
the  objective  was  the  estimation  of  a  univariate  simulation  metamodel  with 
multiple  controls.  Porta  Nova  (1985)  and  Venkatraman  and  Wilson  (1986) 
offer  advice  on  the  number  of  controls  to  be  used  but  do  not  discuss  how  to 
select  these  controls. 

2.8.2  Review  of  Variable  Selection  Techniques 

There  are  several  good  survey  articles  written  on  the  variable  (predictor) 
selection  problem.  Draper  and  Smith  (1981),  Thompson  (1978)  and  Hocking 
(1976)  provide  detailed  surveys  of  the  variable  selection  problem  itself,  while 
Hocking  (1983)  offers  a  succinct  overview  of  the  general  topic  of  linear 
regression.  Siotani,  Hayakawa  and  Fujikoshi  (1985)  provide  some 
multivariate  extensions  of  selected  methods. 

To  organize  the  discussion  we  will  classify  the  selection  techniques 
according  to  model  and  objective.  Model  will  refer  to  either  multiple  linear 
regression  (univariate  response)  or  multivariate  linear  regression  (multivariate 
response).  The  objective  of  a  model  will  be  classified  as  one  of  the  following: 
prediction,  description  or  control.  Aitkin  (1977)  makes  the  distinction 


between  description  and  prediction.  We  draw  attention  to  these  objectives 
only  to  underscore  our  objective,  control. 


2. 6. 2.1  Multiple  Linear  Regression  Model 


Basic  Model.  The  multiple  linear  regression  model  takes  on  the 
following  form 


y,  =  Xj  J  +  e,  ,  1  <  i  <  K 


where  y,  is  the  itk  independent  observed  response,  X,  is  the  ith  observed 
lx(Q-t-l)  row  vector  of  predictors,  3  is  a  (Q-t-l)xl  column  vector  of  unknown 
parameters  (  with  a  1  in  the  first  column)  and  e,  is  the  ith  residual.  We 
assume  that  E(et)  =  0  and  var(e, )  =  o2,  1  <  i  <K.  We  note  that 

E\y\  =  X3 


and 


var:y;  =  o2, 


if  we  can  take  X  as  fixed.  This  is  never  the  case  when  we  employ  control 
variates.  We  must  be  careful  to  distinguish  between  the  case  when  X  is  fixed 
and  the  case  when  X  is  a  random  matrix.  Following  Thompson  (1978),  we 
impose  the  following  condition  for  X  random.  Assume  y,  A”  =  x1(  ...,  £q  are 
jointly  distributed  as  a  (Q-*T)-dimensional  normal  distribution  with  unknown 
mean  vector  and  covariance  matrix.  The  conditional  expectation  of  y  given 
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var(y  |  X  =  i)  =  ar 


where  3  and  o2  can  be  expressed  in  terms  of  the  mean  vector  and  covariance 
matrix  (see  Anderson  (1984),  Chap.  2).  Now  conditionally  on  X  =  x,  y  is 
normally  distributed  with  mean  and  variance  as  given  above.  Finally, 
conditioned  on  X  =  x,  the  model  is 

y  =  X3-\-e,  (2.6.1) 

where  X  is  lx(Q+l),  3  is  (Q-fl)xl  and  E[e)  —  0  with  var(e)  ==  o2.  We  will 
see  that  certain  expectations  can  be  computed  over  X  in  the  case  of  random 
predictors  that  lead  to  criteria  for  variable  selection. 

It  turns  out  that  the  estimates  used  for  3  and  o2  are  the  same  for  either 
model,  however;  the  distributions  of  the  estimators  differ.  If  we  array  all  the 
K  observations  we  have 

Y  =  X5  +  € 

where  Y  is  Kx  1,  X  is  Kx{Q-\- 1),  3  is  (Q+l)xl  and  e  is  Kx  1.  The  least 
squares  solution  for  3  is 


3  =  (x'xr'x'Y 


An  unbiased  estimator  of  -7"  for  both  methods  is  given  as 


=  - - - (Y  -  Xi)'(Y  -  X3) 

K-o-r  M  ' 


Selection  of  Variables 


Given  we  have  Q  candidates  for  predictors,  we  wish  to  select  a  subset  of 
predictors  that  is  in  some  sense  "best".  Using  the  notation  of  Aitkin  (1974), 
we  assume  that  the  first  p  variables  are  selected  and  the  last  Q  —  p  are 
eliminated.  The  subset  model  is 


Y  =  X,^  +  X,J2  +  e, 


where 


X  =  (Xj,X2), 


and 


3  —  (^1,^2)- 


Here  30  is  included  in  Xj  so  Xj  is  Kx{p+ 1).  If  the  last  Q-p  variables  are  not 
included  in  the  model  we  take  32  =  0. 
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The  methods  we  survey  are  typically  of  the  following  form.  A  criterion 
is  chosen  that  reflects  the  intended  use  of  die  regression  model.  Subsets  are 
evaluated  according  to  this  criterion  and  the  “best”  subset  is  chosen  as  the 
solution.  Most  taxonomies  (Draper  and  Smith  (1981),  Thompson  (1978), 
Hocking  (1976))  are  dominated  by  three  approaches:  1)  All  possible 
regressions,  2)  best  k  subset  regressions  and  3)  sequential  procedures. 

The  all  regressions  approach  entails  a  complete  enumeration  of  all  2^—1 
combinations  of  predictors,  each  evaluated  according  to  a  criterion.  Once 
the  criterion  has  been  calculated  for  all  subsets,  the  information  is  arrayed 
and  a  subjective  choice  is  made  based  on  the  criterion  values  and,  perhaps, 
auxiliary  information.  Clearly,  this  is  a  computation-intensive  method.  This 
method  was  not  a  tractable  procedure  until  the  advent  of  modern  high-speed 
computers. 

Best  k  subset  regressions  and  sequential  procedures  have  been  developed 
in  an  effort  to  avoid  the  examination  of  all  possible  regression  subsets. 
Draper  and  Smith  (1981)  cite  the  branch  and  bound  algorithm  given  by 
Furnival  and  Wilson  (1974)  that  computes  only  a  small  fraction  of  the 
possible  regressions  and  yields  the  “best  k  ”  subsets  (k  is  user  specified). 
Sequential  procedures  are  more  economical  than  the  all  regressions  approach 
in  that  they  try  to  find  the  “best”  regression  of  a  certain  number  of  variables. 
There  is  no  guarantee  that  these  methods  will  find  the  “best  regression”,  in 
fact,  different  sequential  procedures  often  yield  different  solutions. 

The  all  possible  regressions  and  best  k  subsets  procedures  use  a  specified 
selection  criterion.  There  is  nothing  from  preventing  us  from  applying  these 
criteria  in  a  sequential  fashion.  However,  we  will  only  discuss  sequential 


i 


procedures  that  are  based  on  a  sort  of  partial  F-test.  Our  discussion  of 
sequential  procedures  will  concentrate  on  l)  Forward  Selection,  2)  Backwards 
Elimination  and  3)  Stepwise  Regression.  First  we  discuss  the  following  five 
popular  selection  criteria:  1)  the  Sp  criterion,  2)  Akaike’s  Information 
Criterion  (AIC),  3)  Rp,  4)  and  5)  Mallow’s  Cp  criterion.  Next  we  discuss 
the  three  sequential  procedures.  We  will  end  our  discussion  of  variable 
selection  techniques  for  the  multiple  linear  regression  model  with  a  summary 
of  some  other  methods:  1)  Ridge  regression,  2)  Principal  Components 
Regression,  3)  Latent  Root  Regression,  4)  Press  and  5)  Inferential  techniques 
due  to  Aitken  and  McCabe. 

Selection  Criteria. 

The  Sp  Criterion.  The  first  selection  criterion  we  discuss  is  the  Sp 
criterion.  Thompson  (1978)  recommends  this  selection  criterion  as  preferable 
if  the  predictor  variables  and  response  can  be  taken  to  represent  a  (Q-fl)- 
dimensional  normal  distribution.  This  criterion  seeks  to  minimize  the 
expected  mean  square  error  of  prediction.  Following  Thompson,  mean  square 
error  of  prediction  is  given  by 

MSEP  =  Ey{y—y  p  )2, 

where  y,  yp  are  respectively  the  observed  and  predicted  values  of  the 
response  that  correspond  to  some  subset  of  size  p  ( p  <Q )  of  predictors.  It  can 
be  shown  that 
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where  rrp  is  the  residual  variance  of  the  p-variable  equation  and  T2  is  a 
Hotelling's  T~  statistic  that  is  a  function  of  the  regression  sample  and  the  p 
predictors.  If  we  take  the  expectation  of  MSEP  across  all  regression  samples 
and  predictors  sets  for  the  p  variables  we  get 
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p 


1+AT-r 


P(K+1) 
K-p  -2 


This  is  estimated  by 


Ep 


1+*+£l*±H  , 

K(K—p )  K-p -2  ] 


where  SSEp  is  the  sum  of  squares  due  to  error  from  the  p  variable  regression. 
After  some  algebra  and  recognition  that  K  is  fixed  for  a  particular 
experiment,  Ep  is  simplified  to 


SSEp 

[K—p  )(K—p  —2)  ’ 


Lindley  (1967)  offers  a  Bayesian  version  of  this  criterion. 

Akaike’s  Information  Criterion  (AIC).  Akaike  (1973)  proposed  a 


criterion  of  the  following  form 
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AICq  =  — 2log(  /  [y  ;x,^ )  )+2 q 


where  /  (y;x,6)  is  the  p.d.f.  (likelihood)  function  of  y  evaluated  at  0,  0  are 
the  maximum  likelihood  estimates  of  the  q  unknown  parameters  of  the  subset 
model.  Akaike  derived  this  criterion  from  information  theoretic 
considerations.  This  criterion  does  not  enjoy  widespread  acceptance  as  a 
selection  criterion.  Schwarz  (1978)  offers  a  Bayesian  version  of  AIC. 


The  Coefficient  of  Multiple  Determination  Rp.  The  coefficient  of 
multiple  determination  Rp  is  defined  (for  a  subset  of  size  p)  as 


SSEp 

SSTO 


where  SSTO  is  the  total  sum  of  squares  for  y.  The  quantity  SSTO  is 
constant  for  all  possible  regressions  and  SSEp  is  monotonically  decreasing  in 
p  (  a  useful  fact  exploited  by  Furnival  and  Wilson  (1974)),  hence,  we  do  not 
seek,  necessarily,  to  find  the  maximum  Rp  .  Here  we  are  looking  for  the  point 
where  adding  additional  predictors  is  not  worthwhile  because  of  a  small 
relative  change  in  Rp.  This  method  is  clearly  subjective. 

The  Adjusted  Coefficient  of  Determination  i?a2.  The  adjusted 
coefficient  of  determination  R^  is  defined  as 


SSEp 
SSTO  ' 


K- 1 
K-p 


This  criterion  takes  into  account  the  subset  size  p  and  penalizes  candidate 
predictors  whose  addition  does  not  adequately  reduce  SSEp.  Neter, 
Wasserman  and  Kutner  (1983)  suggest  graphical  procedures  for  both  Rp  and 
R-. 

Mallows’  Cp  Criterion.  Mallows  (1973)  suggests  a  criterion  based  on 
minimizing  MSEP  in  the  case  X  =  Xu  Xq  are  fixed.  Here  we  will  assume 
that  the  “true”  model  contains  all  Q  predictors.  We  seek  to  find  a  subset 
model  (although  biased  due  to  misspecification,  see  Hocking  (1976))  that 
provides  a  similar  MSEP  to  the  Q  variable  model  and  is  nearly  unbiased.  It 
can  be  shown  that 

MSEP(yt)  =  bias(yt)2  +  var(yt). 


If  we  total  the  mean  square  error  for  all  K  fitted  values  and  divide  by  a2,  the 
true  error  variance,  we  get 
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a  “standardized”  total  square  error  as  a  criterion.  It  can  be  shown  that  a 
good  estimator  of  Fp  is  Cp  where 


When  there  is  no  bias  in  the  Q  predictor  model.  E[Cp)  a  p.  If  we  plot  Cp 
vs.  p,  models  with  substantial  bias  will  •.■•nd  to  fall  significantly  above  the 
line  Cp  =  p  .  The  strategy  is  to  look  for  subsets  with  low  bias  and  small  Cp. 
Thompson  shows  Cp  is  closely  related  to  Rp  and  R2.  She  points  out  that 
the  C  procedure  tends  to  select  a  larger  set  of  variables  than  R2 


Sequential  Procedures 


In  this  section  we  will  discuss  three  sequential  variable  selection 
techniques:  1)  Forward  Selection,  2)  Backwards  Selection  and  3)  Stepwise 
Regression.  All  these  techniques  are  based  on  the  “extra  sum  of  squares” 
principle  as  related  through  the  following  theorem  given  in  Thompson  (1978). 


The  orem.  Given  the  linear  model 


Y  =  XJ  +  e, 


where  Y  is  Kxl  ,  X  is  KxQ ,  3  is  Qxl  with  e  a  Kxl  vector  or  residuals  such 
that  E(e)  =  0  and  var(e)  =  o2!^.  Then  3p^1  =  ...  =  3q  =0  implies  that 


SSiq-p'/Q  -P 


MSEr 


Q-p.K-Q 


where  SSa  _  =  SSRa  —  SSR _  and  SRRq  is  the  sum  of  squares  due  to 


regression  from  all  Q  predictors. 


We  note  that  SSq  _p  is  the  extra  sum  of  squares  due  to  the  Q-p  extra 
variables  in  the  model.  The  strategy  of  all  three  procedures  in  this  section 
will  involve  successive  partial  F-tests  that  test  the  contribution  to  total  sum 
of  squares.  These  tests  decide  whether  new  variables  enter  or  old  variables 
leave  the  model. 

Forward  Selection.  In  the  forward  selection  procedure  we  start  with 
no  variables  in  the  model.  The  algorithm  proceeds  as  follows: 

1.  Compute  the  sample  partial  correlation  coefficient  (w.r.t  y  )  for  all  the 
predictors  not  in  the  model. 

2.  Choose  as  the  entering  candidate  that  predictor  with  the  highest  partial 
correlation  with  the  response. 

3.  Perform  a  F-test  based  on  a  model  of  order  equal  to  the  current  number 
of  predictors  in  the  model  plus  one  for  the  entering  variable. 

4.  If  the  F  statistic?  is  significant  continue  with  step  1;  otherwise  terminate 
the  procedure. 

Backwards  Selection.  In  the  backward  selection  procedure  we  start 
with  all  Q  predictors  in  the  model.  The  algorithm  proceeds  as  follows: 

1.  Treating  each  predictor  as  the  last  to  enter,  compute 
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for  each  ith  variable  not  yet  deleted. 

2.  Choose  the  minimum  Fx  as  the  candidate  to  leave  the  model. 

3.  Compare  the  candidate’s  partial  F  to  a  critical  F. 

4.  If  the  partial  F  is  nonsignificant  we  delete  the  candidate  from  the  model 
and  continue  with  step  1;  otherwise  we  terminate  the  procedure. 

Stepwise  Regression.  In  the  stepwise  regression  technique  we  extend 
the  forward  selection  technique  to  allow  for  the  deletion  of  variables  at  each 
step.  The  algorithm  proceeds  as  follows: 

1.  Proceed  with  steps  1-4  of  the  Forward  Selection  technique. 

2.  If  a  variable  is  included  at  step  4  of  the  Forward  Selection  technique 
then  calculate  partial  F  statistics  for  all  the  variables  currently  in  the 
model. 

3.  If  a  partial  F  is  below  the  critical  value,  delete  this  variable;  go  to  step  1 
in  either  case. 

As  we  mentioned  earlier  there  is  no  guarantee  that  these  methods  will 
arrive  at  the  same  (  or  “best”  for  that  matter)  solution.  There  is  some 
evidence  that  backwards  elimination  is  superior  to  forward  selection  (Mantel 
(1970)).  Draper  and  Smith  (1981)  prefer  the  stepwise  procedure. 

Other  Techniques 

Ridge  Regression.  The  technique  of  ridge  regression  was  developed  to 
counter  the  effects  of  multicollinearity.  Multicollinearity  can  arise  when  some 
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subset  of  the  predic  r  variables  are  highly  correlated.  In  the  linear  model 


given  in  (2.6.1),  these  high  correlations  can  cause  the  matrix  X'X  to  be  nearly 
nonsingular.  Rather  than  use  the  standard  least  squares  estimators  for  3,  we 
introduce  the  biased  “ridge"  estimator  due  to  Hoerl  and  Kennard  (1970): 

3r  =  (X'X  -t-  cIK)~lX! Y  .  0  <  c  <  x:,  (2.6.2) 


where  c  is  an  arbitrary  constant  used  to  perturb  the  diagonal  of  the  X'X 
matrix  and  thereby,  hopefully,  eliminate  the  “ill-conditioning”  of  the  X 
matrix. 

In  practice  a  graphical  aid  called  a  ridge  trace  is  employed  to  help  find  a 
good  value  of  c.  The  ridge  trace  is  a  graph  of  the  estimated  coefficients 
(using  (2.6.2))  vs.  values  of  c  (typically  0  <  c  <  1).  The  ridge  trace  is 
examined  for  a  value  of  c  where  the  estimated  coefficients  stabilize.  This 
device  can  also  be  used  as  a  variable  selection  tool  by  identifying  those 
predictors  with  1)  unstable  ridge  traces  and  2)  coefficients  close  to  zero. 
Draper  and  Smith  (1981)  are  careful  to  point  out  that  this  technique  is  not 
usually  used  for  variable  selection.  Thompson  (1978)  objects  to  to  this 
method  on  two  grounds:  1)  the  method  is  arbitrary  in  that  it  lacks  a  specific 
criterion  and  has  no  stopping  rule  and  2)  the  relative  magnitudes  of  the 
predictor  variables  are  ignored. 

Principal  Components  Regression.  In  principal  components 
regression  the  approach  is  to  break  the  rows  of  the  X  matrix  into  its 
principal  components  (we  discuss  the  principal  components  technique  in  more 
detail  in  a  later  section)  and  retain  only  those  components  that  explain  the 
greatest  portion  of  the  “variance”  in  X.  The  technique  can  help  to  remove 


5 


the  problem  of  multicollinearity  (by  reduction  of  the  dimension  of  X)  but 
suffers  from  forcing  the  investigator  to  work  with  predictors  that  are  linear 
compounds  of  the  original  predictors.  These  linear  compounds  may  be 
difficult  to  interpret. 


Latent  Root  Regression.  This  method  represents  an  attempt  to 
improve  upon  the  principal  components  method.  In  latent  root  regression  we 
augment  the  “correlation”  matrix  of  X  (used  to  extract  the  components  in 
principal  components  regression)  w-ith  Y.  Now  we  extract  the  latent  roots 
(eigenvalues)  and  corresponding  latent  vectors  (eigenvectors).  Next,  we 
array  the  eigenvalues  and  eigenvectors  in  tabular  form.  We  look  for  pairs  of 
eigenvalues  and  that  coefficient  of  the  associated  eigenvector  that 
corresponds  to  Y  that  are  small.  Webster  (1974)  provides  guidelines  for 
smallness.  If  the  smallness  criterion  is  met,  the  eigenvector(s)  in  question  is 
(are)  candidates  for  deletion  from  the  model.  Next,  we  estimate  the 
parameters  for  the  subset  model,  back-transform  into  the  original  variables 
and  compare  to  least  square  estimates.  It  is  important  to  remember  that  the 
new  estimates  are  biased.  We  can  perform  candidate  examination  in  a 
method  similar  to  backwards  elimination  (details  in  Webster).  Draper  and 
Smith  do  not  recommend  this  method  to  the  majority  of  practitioners  due  to 
its  inherent  bias  and  complexity. 

PRESS.  Prediction  Sum  of  Squares  (PRESS)  is  a  hybrid  method 
proposed  by  Allen  (1971).  It  is  hybrid  in  the  sense  that  it  combines  all 
possible  regressions  and  residual  error  into  a  psuedo-jackknife  procedure. 
For  each  subset  of  size  p,  we  delete  one  observation  at  a  time  and  use  the 
remaining  observations  to  predict  the  deleted  response.  At  each  deletion  we 
compute  the  difference  between  the  observed  response  and  the  prediction. 
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This  quantity  is  called  the  predictive  discrepancy.  Once  through  the  data,  we 
sum  the  squares  of  the  predictive  discrepancies  and  proceed  to  t he  next  size 
of  subset.  When  all  the  subsets  have  been  exhausted,  we  array  the 
information  and  choose  a  mode!  with  a  low  sum  of  squares  and  subset  size. 
Draper  and  Smith  (1981)  point  out  that,  although  the  method  helps  detect 
influential  data  points,  there  is  no  set  stopping  rule  and  the  method  is 
computation-intensive. 

Inferential  Methods.  Aitkin  (1974)  points  out  that  application  of 
sequential  procedures  (forward  selection,  backward  elimination,  and  stepwise 
regression)  suffers  from  a  serious  drawback.  The  Type  I  family  error  rate  for 
the  sequences  of  dependent  F-tests  is  unknown.  Aitkin  offers  a  simultaneous 
test  procedure  that  controls  the  family  error  rate  in  the  subset  selection 
problem.  If  a  partition  of  X  =  (Xt  X2)  (  X  is  Kxpl  ,  X2  is  K*p2  and 
Pi  +  p2  =  <2+1  )  is  prespecified  then  it  is  well  known  that  if 

y  =  X,.!1,  +  X2S2, 

with  the  same  assumptions  as  in  section  2.6.2. 1,  then  a  likelihood  ratio  test 
for  -?2  =  0  is  based  on 

(fix  —  KxJ/Pa 
~  (l-Ri)/K-Q-l  ' 


where  /?x,  is  the  squared  multiple  correlation  of  y  with  the  predictors  Xj.  F 
has  a  noncentral  F  distribution  and  a  test  can  be  constructed  for  d2  =  0  by 
finding  the  appropriate  critical  point  of  the  null  distribution.  Aitken  finds  a 


simultaneous  test  for  all  partitions  Xj  by  noting 


r(xt)  =  P2f 


(Rx  -  Rxf 
(1  -  Ri)/K-Q- 1) 


(2.6.3) 


and  defining 


U 


maxx,  Lr(Xj) 


We  have  the  simultaneous  test  for  all  partitions  if  we  do  not  reject  a 
particular  partition  when 


(/(XJ  <C£tQ, 


where  C£  q  is  the  100a%  point  of  the  null  distribution  of  U.  Examination  of 
(2.6.3)  shows  the  maximum  of  [/(Xj  occurs  when  Xt  consists  only  of  the  first 
column  of  X,  that  is,  the  column  of  ones.  In  this  case 


U  = 


Rx 


(1  -R*)/[K-Q-1) 


U/Q  is  noncentral  F  and  the  simultaneous  test  that  does  not  reject  the 
hypothesis  =  0  for  an  arbitrary  partition  if 


Rx  -  Rl 


(i  -rD/k-q- \ 


<  QRq  ,K-Q~l(' 


Aitkin  calls  subsets  not  rejected.  R 


adequate. 


In  the  case  where  the  regression  equation  is  to  be  used  for  description 

rather  than  prediction,  he  oilers  simultaneous  testing  procedures  for  the  cases 

of  X  fixed  and  random.  He  calls  subset  models,  that  are  not  rejected  by  this 

procedure,  MSPE  (Mean  Squared  Prediction  Error)-adequate.  Let  E0  be 

defined  as  the  MSPE  for  the  full  model  when  X  is  fixed.  Let  El  be  defined  as 

the  MSPE  for  the  full  model  when  X  and  y  are  jointly  distributed  as  a 

multivariate  normal  distribution.  The  simultaneous  procedures  test  the 

hypothesis  HQ:  Et  —  E,  =  0  vs.  Hx:  Et  —  Et  <  0  (i=0,l,2  ,  Aiktin  includes 

the  case  i=l  where  X  is  taken  to  be  distributed  uniformly  over  x,  )  where 
* 

£j  is  MSPE  for  a  subset  model.  The  approach  is  to  classify  a  subset  as 
MSPE-adequate  if  the  MSPE  for  the  subset  is  not  significantly  larger  than 
the  MSPE  of  the  complete  model.  The  test  statistic  for  H0:  E0  —  E0  =0  is 


F  o' = 


■  ?>2Xo,1X,2,1J2 

CT~X  2  i  S  22  i  X2  1 


where  32  are  the  least  squares  estimates  (in  the  full  model)  of  32, 
X2  i  =  X2  —  Xo  —  52151~11(X1  —  Xj),  S12  is  the  matrix  of  sample  covariances 
between  elements  of  X;  and  X2,  S22  i  =  S22  —  anc^  ^  's  an 

estimate  of  the  residual  error.  The  simultaneous  test  which  does  not  reject 
the  hypothesis  H0  for  an  arbitrary  partition  if 

F3  <QF0Vk-q- i(  0. 


w  f  1  < 


re  /• 


0  K-Q-l 


is  the  100  >rc  point  of  a  noncentral  F  with  Q  and  K-Q-l 


degrees  of  freedom  and  noncentrality  parameter  of  1. 


The  test  statistic  for  // 0: 


=  0  is 


Rx 


(H  -Ri  ) 

1  -  Ri 


the  simultaneous  test  that  does  not  reject  the  hypothesis  H0  for  an  arbitrary 
partition  if 


R 


o 

X 


<  h 


[p.AK- 2)) 

pjk-pt-i 


(a) 


where  hp^x-p  1S  the  100a%  point  of  the  distribution  of  the  squared 
multiple  correlation  coefficient  based  on  p2  and  K—px— 1  degrees  of  freedom 
when  the  population  squared  multiple  correlation  is  p  2/{K—2). 

McCabe  (1978)  proposes  a  framework  for  variable  selection  called  a- 
acceptability.  Basically,  a  subset  is  considered  l-acceptable  if  the  parameter 
estimates  for  the  reduced  model  fall  into  the  100(1—  <y)%  confidence  region 
formed  by  the  full  model.  We  look  for  subset  models  that  are  “close”  to  the 
full  model.  Given  the  linear  model  of  (2.6.1)  it  is  well  known  that  a 
100(1—  a)%  confidence  region  for  3  is  given  by  (see  Johnson  and  Wichern 
(1982),  pg  304) 
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where  D(b)  =  {3  is  —  b  )'( AWY)(  b  ;s  —  b).  Here  <s  are  the  least  squares 
estimates  from  the  full  model  and  s"  is  the  estimate  of  residual  variance 
(s2  =  SSE(3  ,s )).  We  note  D(b)  is  the  squared  distance  of  any  estimator  b 
from  the  least  squares  estimator  3  ,s  in  the  units  provided  by  X'X.  Therefore 
values  of  b  for  which 


D{b)  <  dn. 


/here  dn  =  s2{Q  +1)Fq  -i(1— a))  are 


acceptable.  The  a- 


acceptable  subsets  form  a  collection.  If  one  of  the  subset  models  is  the  true 
model,  McCabe  provides  a  selection  rule  that  guarantees  that  the  probability 
that  the  correct  model  is  included  in  the  collection  is  greater  than  1— a.  The 
rule  is  to  include  all  subsets  for  which 


SSE(3 ,)  0-4-1 

SS£(.TJ  -  1  +  K—Q  — 1  F9  +  iW-c?-i(1~Q)> 


where  3 3  are  the  coefficients  in  the  subset  model.  This  rule  is  derived  from 
1)  the  fact  that  D(3S )  =  SSE(3s)  —  SSE{3is )  and  2)  for  any  3  =  3S  (the 
true  model  is  the  subset  model),  Pr,  (3  £  Sn)  >1  —  0:.  McCabe  notes  that 
Aitkin's  /?*-adequacv  selection  rule  can  be  written  as 


SSE(3a)  q 

SSE(3,s)  ~  1  +  K-Q-l  FQ’K-Q-i(l~a)' 


and  concludes  the  collections  of  subsets  obtained  using  inadequacy  are  not 
larger  than  those  subsets  obtained  using  ••  i-acceptability. 


.  -j.'*  •  v 


.v'vy-r. 


'•r.  V. 


wmm 
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2. 6. 2. 2  Multivariate  Linear  Regression  Model 

Basic  Model.  The  multivariate  linear  regression  model  takes  on  the 
following  form 

T,  =  Xx  3  +  e,-  ,  1  <  i  <  K 

where  Yt  is  the  ith  independent  observed  response  vector  of  dimension  m,  X j 
is  the  ith  observed  lx(Q+l)  row  vector  of  predictors  (with  a  1  in  the  first 
column),  3  is  a  (Q+l)xm  column  vector  of  unknown  parameters  and  e,  is  the 
ith  lxm  vector  of  residuals.  We  assume  that  £(e,)  =  0  and  cov(et)  =  E, 
1  <  i  <K.  We  note  that 

E[Y\  =  X3 


and 


! 


varjyj  = 


if  we  can  take  X  as  fixed.  As  in  the  multiple  regression  case,  we  must  be 
I  careful  to  distinguish  between  the  case  when  X  is  fixed  and  the  case  when  X 

is  a  random  matrix.  We  can  use  conditioning  arguments  to  show  that  we 
may  estimate  the  parameters  of  the  multivariate  linear  regression  model 
I  using  the  same  formulae  for  both  random  and  fixed  predictors.  If  we  array 

all  the  K  observations  we  have 


Y  =  XJ  + 


I 


>.vy. 
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where  Y  is  Kxm,  X  is  Kx(Q  + 1),  3  is  (Q+l)xm  and  e  is  Kxm.  The  least 
squares  solution  for  3  is 


*  k 

A’ 


3  =  (X'X)_1XY, 


and  an  unbiased  estimator  of  _  for  both  methods  is  given  as 

v  =  (Y  -  X.?)'(Y  -  X3) 

K-Q-l 


(2.6.4) 


i:- 


see  Anderson  (1984),  theorem  8.2.1,  pg.  289. 


Selection  of  Variables 


Relatively  little  has  been  written  concerning  the  selection  of  predictor 
variables  in  the  multivariate  linear  model  context.  Many  papers  have  been 
written  about  variable  selection  in  the  closely  related  discriminant  analysis 
problem;  Seber  (1984)  provides  a  review. 

Siotani,  Hayakawa  and  Fujikoshi  (1985)  offer  multivariate  extensions  to 
Cp  and  AIC.  They  also  show  how  the  forward  selection  technique  can  be 
extended  to  the  multivariate  case.  Let  RSSp  be  the  mxm  matrix  of  residual 
sums  of  squares  and  cross-products, 


t&il 


RSSp  =  (Y  -  X3)'(Y  -  XJ). 


Siotani  et  al.  generalize  the  Cp  criterion  to  m  responses  as 


Cpm  =  RSSp[  +  2 mp  -  Km, 


where  H  is  as  given  in  (2.6.4).  They  generalize  AIC  as 

RSS 

AICm  =  K* log  | - -  |  +  2 mp. 

K 


To  extend  the  forward  selection  technique,  Siotani  et  al.  suggest  a 
generalization  of  the  F-test  using  the  Wilks  V  statistic.  Let 

'  p  |  RSSp  |  ’ 

I 

in  this  case  p-1  variables  have  been  selected  and  the  pth  variable  is  being 
considered  as  a  candidate  for  entry.  If  the  pth  variable  affords  a  significant 
reduction  in  generalized  residual  variance  then  it  will  enter  the  model. 
Testing  for  nonsignificant  reduction  in  generalized  residual  variance  is 
equivalent  to  testing  if0:  j3p  =  0.  Under  H0, 

K-p-m  1  ~  AP 

rn  Ap  -  *rn,. K-p-m, 


so  once  again  we  perform  F-tests  on  candidates  for  entry.  If  the  candidate 
enters,  we  choose  the  next  candidate  by  finding  that  predictor  that,  along 
with  the  p  variables  currently  in  the  model,  minimizes  j  RSSp  j. 


McKay  (1977)  extends  Aitkin’s  (1974)  /?2-adequacy  procedure  to  the 
multiresponse  case.  He  explores  the  regressions  between  predictors  and 
subsets  of  responses.  This  is  an  interesting  approach,  in  that  it  allows  one  to 
think  in  terms  of  candidate  responses  as  well  as  candidate  predictors. 
McKay  proposes  several  procedures,  the  most  intuitive  being  a  likelihood 
ratio-based  simultaneous  testing  procedure.  In  this  procedure  selection  is 
based  on  the  squared  canonical  correlations  between  response  and  predictor 
subsets.  This  methodology,  as  well  as  those  procedures  due  to  Aitkin,  are 
applications  of  the  simultaneous  test  procedures  offered  by  Gabriel  (1969). 

McKay  reasons  that  the  amount  of  information  about  the  variation  in  a 
set  of  responses  provided  by  a  set  of  predictors  is  reflected  in  their  squared 
canonical  correlations.  He  shows  that  any  subset  of  predictors  can  contain 
the  same  amount  of  information  as  the  full  set  if  and  only  if  the  slope 
coefficients  of  the  deleted  variables  are  zero.  Let  Y  ,  X  be  jointly  distributed 
as 


■N, 


m+Q 


l*  Y 


1 


V  V 

— YY  — YX 

V  v  , 
—XY  —XX 


further  partition  X'  =  (X1  f  X'g),  Y  =  (Y'v  Y'w),  where  f  Ug  =  s  , 
s  =  l,...,m  and  uUtu  =  u  ,  u  =  ■  This  partitioning  is  arbitrary  in 

the  sense  that  we  allow  any  pair  of  subsets  to  be  represented.  McKay’s 
strategy  is  to  apply  one  of  Gabriel’s  simultaneous  testing  procedures  to 
hypotheses  of  the  form  cuU3:  3VS  =0  and  of  the  form  uJvg  /=  0,  where  wvgf 
refers  to  a  hypothesis  in  which  the  3  matrix  has  already  had  those  columns 


correspondin'’  to  :he  par’ i- ion  g  zeroed  out.  To  get  a  simultaneous  test  with 
n redetermined  Type  I  family  error  rate,  we  test  all  such  hypotheses  versus 
the  same  critical  value.  The  critical  value  arises  from  the  “overall" 
hypothesis,  that  is  that  there  does  not  exist  a  multivariate  linear  regression 
between  Y  and  X.  A  likelihood  ratio-based  simultaneous  testing  procedure  is 
based  on  statistics  of  the  following  form. 


When  the  hypothesis  is 


= 


‘“’cv  ■A's  As 


'5 

ss  sv 


!  $vv 


where  Svv,  for  instance,  is  the  sample  covariance  matrix  of  the  v  responses. 
When  the  hypothesis  is  uivg  j  , 


U»?-/  = 


I  C  _  C  C  -1  c  I 

I  WVU  '-'vs  ^ 3S  ^ SV  I 

~  SvfSfflSfv 


We  reject  a;vs  when  Uvg  f  >  q.K-q-i>  where  ^.q.k-Q-i  »  the  > 
percentage  point  of  the  distribution  of  a  random  variable  distributed  as 


where  G  is  distributed  as  a  Wishart  distribution  with  K—Q—l  degrees  of 


freedom,  H  is  distributed  as  a  Wishart  distribution  with  Q  degrees  of  freedom 
and  G  and  H  are  independent.  McKay  defines  ‘.-adequate  subsets  of 
predictors  (A' j )  for  the  subset  of  responses  (Xv )  as  those 


CHAPTER  3 


CONTROL  VARIATE  SELECTION  CRITERIA 


In  this  chapter  we  derive  control  variate  selection  criteria  for  two  cases. 
First,  we  derive  a  criterion  for  the  case  when  the  covariance  matrix  of  the 
controls  is  estimated.  Next,  we  develop  a  new  estimator  that  directly 
incorporates  the  use  of  a  known  covariance  matrix  for  the  controls.  Finally, 
we  present  a  selection  criterion  based  on  the  new  estimator. 

3.1  A  Selection  Criterion  When  the  Covariance  Matrix  of  the 
Controls  is  Estimated 

In  this  section  we  derive  a  criterion  for  use  in  the  selection  of  control 
variates  when  the  object  is  the  immediate  construction  of  confidence  regions 
about  the  mean  vector  of  responses.  We  demonstrate  that  this  criterion  acts 
as  a  multivariate  extension  of  the  univariate  selection  criterion,  S  . 

Our  objective  is  the  immediate  construction  of  controlled  confidence 
regions.  We  use  the  word  “immediate”  to  mean  that  we  use  only  the  data  at 
hand,  we  do  not  make  additional  replications.  It  seems  reasonable  to  pick 
that  subset  of  controls  which  yields  a  confidence  region  of  minimum  expected 
volume.  However,  for  mathematical  efficacy,  it  is  more  convenient  to  consider 
minimizing  the  expected  squared  volume  of  the  confidence  region. 


"~W 


First,  we  will  consider  the  case  of  a  single  response  and  show  how  the 
the  appropriate  criterion  is  a  modified  form  of  the  Sp  criterion.  In  the  section 
following,  we  extend  the  criterion  to  multiple  responses. 

3.1.1  Univariate  Response 

In  one  dimension,  the  squared  volume  of  the  confidence  region  reduces  to 
the  squared  width  of  the  usual  controlled  confidence  interval.  Let 
correspond  to  the  optimal  width  of  a  confidence  interval  constructed  from 
j<Q  controls  with  a  signifcance  level  a.  We  seek  to  find  that  subset  of  size 
j  such  that  we  minimize  the  expected  squared  length  of  the  confidence 
interval:  in  symbols 

minj  E 

Now 

(Wi)2  =4^_y_1(l—  a/2)v?\su 


(3. 1.1. 2) 


(W  ’)* 


(3. 1.1.1) 


as  developed  in  (2.2.39).  We  wish  to  compute 


min,,  4 {l-a/2 )E 


(3.1. 1.3) 


but 


E  ^2|Slll  =  var (Y(0)), 


(3. 1.1.4) 


by  the  arguments  of  (2.2.33)  to  (2.2.39)).  Now,  equation  (2.2.30)  gives 


,,-,<5^  a~  K— 2 

var(Y(j))  =  -  - 

V  v  11  K  K-j-2 


(3. 1.1.5) 


If  we  estimate  of  by 


;2_  me, 

'  K-j-l  ’ 


(3. 1.1.6) 


then  some  algebra  and  recognition  of  those  terms  that  are  constant  across  all 
subsets  yields  a  selection  rule  of 


miny  t£_j_x  (l-a/2) 


(3.1. 1.7) 


{K — j —l)(K — j —2)  Sp 


(3.1. 1.8) 


(actually  Sp  is  S.  but  we  use  p  for  the  dimensionality  of  Y).  The  selection 


rule  is 


minJ  <i_;_1(l-or/2)5p. 


(3.1. 1.9) 


We  note  that  if  K  j  then  the  criterion  reduces  to  S„. 


3.1.2  Multivariate  Response 

To  extend  this  procedure  to  p  responses,  we  seek  the  subset  which  yields 
the  minimum  expected  squared  volume  of  the  confidence  region.  Rao  (1967) 
gives,  under  the  multivariate  normal  assumption,  the  100(1— a )%  confidence 
ellipsoid  for  fiy  < 


Pr  j  My)'^y  !a-(y(3)— /1y)  <  (d'd)C0FP)A-_Q_p  (l-'O  |=  1— a, 


(3. 1.2.1) 


where 


C0  =  {(K-Q-l)p/(K-Q-p)} 


d'  -  I'k/K  -  (X— /ix)'(G'G)~lG', 


(3. 1.2.2) 


where  G  is  defined  in  (2.4.25)  and  1  %  is  a  K-dimensional  column  vector  of 


ones.  Also  we  have 


^y|x  —  (^yy— ^yx^xx^xy)- 


(3. 1.2.3) 


Let  (V^2  denote  the  squared  volume  of  the  confidence  region  constructed 


with  j<Q  controls  at  significants  level  a.  We  seek  to  compute 


min;  E(VJ)2. 


(3.1. 2.4) 


*V‘ -  •*.  v\*".  v-  '  ■  -  ■/  v  -  *’  .  .*  \m  t 


I  A' 


)  -rv  —  -y'A'-.YA'  -AT  I 


-YX-XX-XY  —YY 


'YY 


!  —YY 


n(i 

!  =1 


P,2), 


(3.1.2.10) 


where  u  =  rank  (Zyjy).  Now,  |  Eyy  |  is  not  a  function  of  j;  and  noting  this 
and  replacing  the  canonical  correlations  with  estimates  we  get  a  criterion  of 


C, 


K-J-l 

K-j-p 


Fp,K-]-p^~a) 


n(i  -p.2)- 

t=-i 


(3.1.2.11) 


3.2  A  Selection  Criterion  When  the  Covariance  Matrix  of  the 
Controls  is  Known 

Situations  arise  in  discrete  event  simulation  where  the  covariance  matrix 
of  control  variates  is  either  known  a  priori  or  can  be  computed  with  relative 
ease.  Several  authors  have  suggested  such  controls  for  the  class  of  closed 
queueing  networks  studied  by  Lavenberg  et  al.  (1982).  Wilson  and  Pritskrr 
(198 ‘a, b)  and  Venkatraman  (1983)  propose  standardized  control  variables  for 
these  systems.  In  addition  to  having  an  asymptotically  known  mean,  the 
controls  offered  by  Wilson  and  Pritsker  also  have  an  asymptotically  known 
covariance  matrix.  Venkatraman’s  controls  have  a  mean  vector  known 
exactly  with  a  covariance  matrix  that  is  also  known  asymptotically.  In 
Chapter  4  of  this  research,  we  offer  a  new  class  of  controls  for  which  both  the 
mean  vector  and  covariance  matrix  are  known  asymptotically.  To  emphasize 
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the  potential  diversity  of  situations  where  the  covariance  matrix  is  known, 
we  cite  the  so-called  “path”  controls  offered  by  Venkatraman  and  Wilson 
(1985).  These  controls  arise  in  the  simulation  of  stochastic  activity  networks. 
We  believe  that  there  is  a  large  class  of  simulated  systems  for  which  such 
controls  can  be  developed. 

In  this  section  we  develop  a  controlled  estimator  Y(~<)  that  directly 
incorporates  the  known  covariance  matrix  of  the  controls.  We  derive  its 
various  properties  and  develop  a  selection  criterion  based  on  this  estimator. 


3.2.1  The  Estimator  Y(~/) 


In  Section  2.4  we  introduced  the  estimator 


Y(3)  =  Y—  3{X-nx)  , 


(3.2.1. 1) 


where  3  —  SYxsxx-  Here,  as  in  Section  2.4,  Y  is  a  pxl  vector  of  responses, 
3  is  the  estimated  p  xQ  matrix  of  control  coefficients  and  X  is  a  Qxl  vector 
of  controls  with  mean  vector  ^x-  In  this  section  we  will  introduce  the 
estimator 


Y(y)  =  Y-7  (X— mx) 


(3.2. 1.2) 


where 


- .  c  v  —  l 

'  —  ^YX—XX 


(3. 2.1. 3) 


This  estimator  incorporates  a  known  covariance  matrix,  ,  for  the 

controls. 


Provided  that  the  responses  and  controls  are  jointly  normal,  Y 
unbiased  estimator  for  //y  ■  To  show  this  we  write  Y(~)  as 
combination  of  the  K  observed  responses.  Let 


G  = 


{Xi  -  X)' 


(XK  -  T/ 


and  define 


H  —  K  *1 x  ~  [X— 1)  _yx(X— /;x) 


We  observe 


l^-'G  —  0,  —  1, 


yv  •  •  • ,  y 


K 


G  —  [K— l)5y_y  , 


and  so  we  write 


Y b)  = 


Ylt  ■  •  •  ,  Y 


K 


H  . 


Now,  define  Z  =  vec  X  =-  vec  (Xt,  ■  •  •  ,  XK)  so  that  Z  is  1 1 
dimensional  column  vector  formed  by  stacking  the  Q-dimensioua 


{X}\  one  upon  another.  Now,  the  condition  -  A.  =  r. 


'"<)  is  an 
a  linear 


(3.2. 1.4) 


(3.2. 1.5) 


(3.2.1. 6) 


(3. 2. 1.7 

te  i\>. . 
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compactly  expressed  as  Z  =  z.  Also,  define  Mz  =  vec  (1  x®Mx)-  We  have 
^[y(7)  =  Ex  Ey  vec  YH  |  Z  =  2J  J  (3.2. 


=  £x[(H'®/p)£r  vec  Y|  Z  =  2]] 

Now,  assuming  joint  normality  of  the  responses  and  controls,  we  have 
=  Ex  (H'®/p)  |jlx  ®My )  +  {Ik  ®£yx£xx)(2  ~  M?)  Jj 


—  U®My)  +  (H'S-rx^xxX2  ~  M z ) 


—  Ex  Mr  +  (H'^^yX^XX-)2  —  (H'®Dyx^'Xx)(1iC  ®Mx) 


—  Ex  Mr  +  (^rx^xxXE)  —  (l®Eyx£xxMx) 


=  Mr  +  ^x^rx^xx™)  —  (Syx^xxMx) 


=  Mr  +  ^yx^xx-^x  —  (Syx^xxMx) 
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—  /iy  +  ^yjsf^jci'-Sx’  [x  ~  (-K— l)  1XGEj!Ci[r(X—  Mx)  ]  —  i^YX^XX^x) 

—  Mr  +  &x  ® jot  ^jqt(X  Mx)  i 

and  under  the  assumption  of  multivariate  normality,  we  have 

=  Ex  [sjor^ja:  Ex  ^(X—  /ix)  = 

Ex  [s.iQrSxvr(^— Mx)|  =  0  (3.2. 1.9) 

_  _  *. 

since  is  independent  of  X  in  this  case.  Hence  the  estimator,  Y( 7),  is 

_ 

unbiased.  The  covariance  of  Y(^)  is  given  by 


yti)\  = 


K+Q  1  y  Q  +1  y  ^_1V 

K(K- 1)  r,x+  K(K-l)  ZyxZxxZxr 


(3.2.1.10) 


The  derivation  of  (3.2.1.10)  is  Appendix  1.  Algebraic  manipulation  of  the 


above  reveals 


Y^)  = 


K  +  Q-l  y 


K- 2 
K+Q-l 


■  Zyy  LyX^XX^XY  . 


(3.2.1.11) 
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hence  the  variance  ratio  is  given  by 


*  det(cov(  Yn  ; 


det(cov(Y)) 


(3.2.1.12) 


K+Q-l  JL  ,  K- 2  2 

k- i  a:+q  -i Pt  ’ 


(3.2.1.13) 


where  ^  =  rank  Let  p*  be  the  smallest  canonical  correlation,  then 


V2  < 


K+Q-l 
K- 1 


K-2  2 

1  — - pi 

K+Q-l 


(: 3.2.1.14 ) 


Now  if  v  =  p,  a  resonable  condition  corresponding  to  well-picked  responses 
and  at  least  p  well-picked  controls,  then  the  right  hand  side  of  (3.2.1.14)  can 
be  shown  to  be  less  than  1  (see  Appendix  2)  if 


P*  > 


K-2 


(3.2.1.15) 


We  have  shown  that  it  is  possible  to  achieve  a  reduction  in  the 

_  A  _  A 

generalized  variance  using  Y(7)  as  an  estimator.  However,  for  Y(-y)  to  be  of 
practical  significance  we  require  it  to  reduce  the  generalized  variance  relative 

_  A  __  A  __  A 

to  its  competitor  Y{/3).  To  contrast  the  estimators  Y("y)  and  Y(3),  we  form 
the  ratio 


ofoC  vv i  ’oWo/oV  o’/.'1'  ofvf  - 
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~  =  det(cov(Y(7)))  = 
det(cov(T(/3))) 


K+Q-l 

r  - 

n 

t-X 

K-  2  , 

K- 1 

K+Q-l  Px 

/-  - 

CM 

1 

* 

CM 

Cu 

1 

t-H 

K-Q-2 

(3.2.1.16) 


Using  mathematical  induction  and  assuming  that  u  =  p  (see  Appendix  3),  we 
can  show 


P*  > 


(K+Q-l)(K-Q-2) 

(K-l)(K-2) 


-  1 


(K+Q-l)(K~Q-2) 

K—2 

(K-l)(K-2) 

K+Q-l 

V  <  1 


-  1 


(3.2.1.17) 


We  have  shown,  subject  to  conditions  on  the  canonical  correlations,  that 
reduction  in  the  generalized  variance  of  the  estimator  is  possible.  The  next 

__  A 

step  is  to  create  a  100(1 —Qt)%  confidence  region  based  on  Y^). 

We  construct  a  100(1— a)%  confidence  region  based  on  the  following 
considerations.  We  assume 

^T)  ~  Np  (/Uy ,  E)  ,  (3.2.1.18) 


where  E  is  given  by  (3.2.1.10).  Let  E  be  an  estimator  of  E,  further;  assume 

A 

( n-Q)E  ~  Wp  (n  — Q  ,  E)  ,  (3.2.1.19) 
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where  Wp(n—Q  ,  E)  is  a  random  matrix  distributed  as  Wishart  with  n—Q 
degrees  of  freedom  and  expected  matrix  E;  here  n  =  K— 1.  Now  if  Y(~/)  is 

a 

independent  of  E  and  we  define 


T2  =  (Y('y)-/iY)'S-1(Y('y)-MY)  , 


(3.2.1.20) 


we  can  show  (see  Muirhead  (1982),  Theorem  3.2.13,  pg.  98) 


T2  K-Q 
K—Q — 1  p 


Fp,k-p-q  t 


(3.2.1.21) 


where  Fp  k-p~q  is  a  random  variable  distributed  as  a  central  F  with  p  and 
K—p—Q  degrees  of  freedom  respectively.  We  can  form  a  confidence  region 
based  on 


Pr  |  (Y('7 )— /zy)'E-1(Y(7 )— /JY)  <  C0FPiK_Q_p(l-a)\=  l-o, 


(3.2.1.22) 


where 


C0  =  {(K-Q-l)p/(K-Q-p)\ 


nr 


3.2.2  A  Selection  Criterion 

As  in  the  previous  section,  our  objective  is  the  immediate  construction  of 
controlled  confidence  regions.  Again,  we  will  seek  that  subset  of  controls 
that  yields  a  confidence  region  of  minimum  expected  squared  volume. 
Expression  (3.2.1.22)  is  used  to  construct  the  100(1— a)%  confidence  ellipsoid 
for  Hy  Let  (V^)2  correspond  to  the  optimal  squared  volume  of  the 
confidence  region  constructed  with  j  <  Q  controls  at  significance  level  a. 
We  seek 

min,  E{Vi)2.  (3.2.2.1) 

v 

;  now,  it  can  be  shown  that  the  squared  volume  of 

this  ellipsoid  is  given  by 

(V’Y  -  p-*C2(p ) | ij I T,  (*■,*_,■_,(  1-a)  f ,  (3.2.2.2J 

A 

where  C(p)  is  given  by  (2.4.13).  Assume  (K—l  — j)£,  ~  Wp  (K  — 1— j ,  £). 

a 

Here  E,  depends  on  the  subset  of  j  controls.  Using  Theorem  3.2.15  of 
Muirhead  (1982),  we  calculate  the  expected  value  of  |  E,  |  as 

|E;|C4,  (3. 2. 2. 3) 


Let  Tj  = 


p(K-j-l) 

K-j-p 


where 


for  K—j—l  >  0.  So 


E  [(V >)! ]  -  p"!CJ(p )  |  £,  |  C(r;  (f, (1-c) ^ 


and  since  p  is  fixed  for  all  j,  we  seek 


Un>  K-j-P 


We  do  not  know  £  ■  so  we  estimate  it  as 


j  1  y  ,  j ~H  ( y  y 

K(K-l)hr'x+  K(K- 1)  [Srr-Sr\x 


where 


^y\x  —  (^rr  —  syxsxxsxy  ]  » 


EyY  =  Syy  , 


where  Syy  is  defined  in  equation  (2.4.17). 


mmSm 
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CHAPTER  4 

IMPLEMENTATION  OF  THE  SELECTION  CRITERIA 
IN  QUEUEING  NETWORK  SIMULATION 


In  this  chapter  we  discuss  the  evaluation  portion  of  the  research.  First, 
we  describe  a  class  of  closed  queueing  networks  that  were  used  as  the 
experimental  vehicles  of  this  research.  Then,  we  describe  an  experimental 
procedure  in  which  a  series  of  simulation  metaexperiments  were  carried  out 
to  evaluate  the  performance  of  the  multivariate  selection  criteria.  Here  we 
discuss  the  system  responses  investigated  and  the  candidate  controls.  Next, 
we  discuss  the  the  performance  measures  employed.  Finally,  we  discuss  the 
methodology  used  to  obtain  the  optimal  subset  of  controls  in  each  basic 
experiment. 

4.1  Description  of  the  Simulated  Queueing  Networks 

We  chose  four  different  queueing  systems  as  experimental  vehicles. 
These  queueing  systems  were  suggested  by  Lavenberg  et  al.  (1982).  These 
systems  are  members  of  a  broad  class  of  closed  queueing  networks.  There 
are  several  major  advantages  to  choosing  such  systems.  The  first  major 
advantage  lies  in  the  fact  that  these  systems  have  been  studied  extensively 
and  workable  controls  have  been  developed  by  several  authors.  Lavenberg  et 
al.  (1982)  have  developed  a  set  of  control  variables  that,  in  the  univariate 
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response  case,  have  produced  significant  variance  reductions.  Wilson  and 
Pritsker  (1984a,  b)  show  how  to  modify  these  controls  to  insure  asymptotic 
stability.  Venkatraman  (1983)  offers  a  standardization  scheme  for  one  subset 
of  the  controls  of  Lavenberg  et  al.  Another  advantage  in  using  these 
networks  is  that  they  are  representative  of  a  class  of  queueing  systems  which 
are  frequently  analyzed  in  computer  performance  modeling.  Finally,  the  two 
simpler  networks  we  employ  have  a  steady-state  behavior  which  can  obtained 
analytically.  This  information  was  of  great  value  for  validation. 

As  outlined  in  Lavenberg  et  al.,  the  queueing  systems  considered  take 
the  following  form.  Consider  a  finite  set  (say  of  size  5)  of  interconnected 
service  centers.  These  centers  service  D  different  types  of  customers.  There 
are  a  total  of  N  customers  of  all  types.  .Assume 

1.  Markovian  routing  so  that  the  next  station  visited  only  depends  on  the 
current  location. 

2.  The  service  times  for  the  the  jth  type  of  customer  at  the  ith  service 
station  are  drawn  independently  from  a  given  distribution  F,y(*)  with 
finite  mean  and  variance. 

3.  Service  time  sequences  and  sequences  of  centers  visited  are  mutually 
independent. 

There  are  two  basic  types  of  networks  to  be  considered  in  this  general 
setting.  Figure  1  portrays  the  form  of  the  first  type  of  simulated  network. 
Service  center  1  has  N  servers,  where  N  is  the  total  number  of  customers  of 
all  types.  We  can  think  of  this  service  center  as  a  room  filled  with  N 
interactive  computer  terminals.  The  service  centers  labeled  2,  ■  •  •  ,  5  are 


Figure  1.  Type  I  Network 


■‘IvXv. 


iii 


CmW 


single  server  queues  with  the  customers  being  served  in  order  of  arrival.  We 
can  think  of  service  center  2  as  a  central  processing  unit  (CPU)  with  service 
centers  3,  •  •  •  ,  S  as  peripheral  devices  to  be  accessed  by  the  CPU.  The  S  by 
S  transition  matrix  that  characterizes  the  flow  of  customers  in  the  network 
has  the  form 
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where  pk(d)  k  =  1,  ■■■,  S  is  the  one  step  transition  probability  from  service 
center  2  to  the  remaining  centers  (for  a  customer  of  type  d).  In  this  type  of 
network  we  have  made  the  implicit  assumption  that  every  customer  that 
requests  service  from  the  CPU  is  immediately  granted  his  requisite  memory 
allocation.  In  real  world  interactive  computing  environments,  customers  often 
must  queue  for  memory  at  the  CPU.  This  blocking  effect  due  to  memory 
limitations  of  the  CPU  is  modelled  by  the  next  type  of  network. 

We  refer  to  this  second  class  of  systems  as  networks  with  subnetwork 
capacity  constraints.  The  CPU  and  associated  peripherals  are  the 
subnetwork.  A  network  of  this  type  is  portrayed  in  Figure  2.  The  dotted 
line  encloses  the  subnetwork.  Service  center  2  is  now  merely  a  queue  for  the 
subnetwork  with  capacity  N'  <  N  customers.  There  is  no  service  time 


In  this  case  the  5  by  5  transition  matrix  takes  the  form 
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In  our  experiments  we  chose  4  networks  from  Lavenberg  et  al.  (1982). 
Two  of  the  networks  were  of  the  first  general  type,  i.e.,  no  subnetwork 
capacity  constraints.  The  other  two  networks  had  subnetwork  capacity 
constraints.  The  networks  are  parameterized  in  the  tables  below. 


Table  4.1  Parameters  of  Queueing  Systems  Used 
in  the  Experimental  Evaluation 


Network 

No. 


N 

No.  customers 


Subnetwork 

Capacity 

N' 


Number  of 
Service  Centers 


We  chose  the  first  two  networks  because  Lavenberg  et  al.  presented  detailed 
results  of  their  experiments  using  these  two  networks.  Also,  these  two 
simpler  networks  display  a  steady  state  behavior  that  can  be  obtained 
analytically.  Results  for  the  networks  3  and  4  were  presented  in  Lavenberg, 
Moeller,  and  Welch  (1979). 

4.2  Layout  of  the  Simulation  Experiments 

In  this  section  we  discuss  elements  of  the  experimental  layout.  First,  we 
briefly  describe  the  relationship  of  our  basic  experiment  to  an  overall 
metaexperiment.  In  the  next  section,  we  discuss  the  selected  system 
responses.  Following  that,  we  list  the  selected  control  variables.  Finally,  we 
discuss  the  selected  performance  measures. 

4.2.1.  Composition  of  the  Metaexperiments 

A  basic  experiment  consisted  of  a  set  number  of  independent  replications 
of  the  simulation  model.  We  chose  two  replication  levels,  20  and  40.  Within 
a  basic  experiment  a  selection  procedure  was  employed  to  obtain  the  “best” 
subset  of  controls.  We  discuss  the  selection  procedure  in  a  later  section.  An 
overall  metaexperiment  was  conducted  for  each  network.  This  meta¬ 
experiment  consisted  of  50  independent  replications  of  the  basic  experiment. 
Therefore,  when  the  replication  level  of  a  basic  experiment  was  40,  we  ran 
40x50  =  2000  independent  replications  of  the  basic  experiment. 


4.2.2.  Selected  System  Responses 


We  elected  to  look  at  a  two  dimensional  response  across  all  the  models. 
We  were  interested  in  the  response  vector  ^CPui Oj  where  R{t)  is  the 

mean  response  time  for  a  system  request  accumulated  up  to  simulated  time  t 
and  Ucpu(t )  is  the  utilization  of  the  CPU  accumulated  up  to  simulated  time 
t.  These  responses  seem  to  make  good  sense  from  both  a  system  and 
customer  viewpoint.  The  customer  is  most  interested  in  the  response  time  of 
the  system  and  the  system  adminstrator  is  probably  most  concerned  with  the 
utilization  on  the  CPU,  since  it  is  probably  by  far  the  most  expensive 
component  of  his  system. 

4.2.3.  Selected  Control  Variables 

We  considered,  as  candidates,  modifed  versions  of  the  control  variables 
proposed  by  Lavenberg  et  al.  (1982),  as  well  as  as  new  control  of  our  own 
design.  These  control  variables  can  be  classified  into  three  basic  types:  1) 
service  time  variables,  2)  flow  variables,  and  3)  work  variables.  All  of  these 
variables  are  collected  at  each  service  center  for  each  customer  type.  In  the 
form  suggested  by  Lavenberg  et  al.,  service  time  variables  are  the  sample 
mean  service  times.  Flow  variables  are  the  sample  proportion  of  departures 
from  particular  centers  relative  to  the  total  number  of  departures  from  all 
centers.  Work  variables  are  the  product  of  the  service  time  variables  and  the 
flow  variables. 

Lavenberg  et  al.  calculated  the  asymptotic  means  for  these  controls.  In 
their  analysis,  they  assumed  that  the  run  lengths  of  their  models  were 
sufficient  to  warrant  the  use  of  the  asymptotic  means.  We  chose  to  use 


“standardized”  forms  of  the  service  and  work  variables  of  Lavenberg  et  al. 
We  developed  a  standardized  flow-type  variable  based  on  the  multinomial 
routing  of  each  network.  We  used  “standardized”  controls  in  an  effort  to 
avoid  numerical  difficulties  that  result  from  unit  of  measurement  differences. 

We  considered,  as  candidates,  service  time  variables  of  the  type 
proposed  by  Venkatraman  (1983).  For  service  center  j  and  customer  type  d, 
define 

XjA‘)  -  (<r1/!  "e  V^.i  w  -  , 

/-i 


where:  g(j,  d;  t )  is  the  number  of  service  times  started  at  station  j  for 
customer  class  d  during  the  simulated  interval  (0,  t]  (  here  0  marks  the  start 
of  the  statistics  collection  period  );  and  U :  d/  /  is  the  l  th  service  time 
sampled  at  station  j  for  customer  class  d,  where  E{Ujd  (  )—yijd  and 
v*r [U]idJ  )  =  cr2hd. 

Venkatraman  (1983)  shows  that 

£[X(0!  =  0V<  >0 


where 


x(0  =  [xl(<),  •••  ,x,(0] 


and  since 


where  - *■  signifies  convergence  in  distribution  and 

t  — *oc 


q  =  diag(aj,  a  ) 


* 


ak  =  (Steady-state  utilization  of  station  k)/fi k 


(note  that  we  have  dropped  the  customer  class  distinction).  Venkatraman 
provides  guidance  for  the  computation  of  ak. 


Among  the  candidate  control  variables  are  versions  of  the  work  variables 
due  to  Lavenberg  et  al.  that  have  been  suitably  “standardized”  as  proposed 
by  Wilson  and  Pritsker  (1984a,  b)  (see  equation  2.2.48).  This  class  of 
controls  are  attractive  because  the  vector  of  q  standardized  work  variables 
converges  to  a  q-variate  normal  distribution  with  a  zero  mean  vector  and 
unit  dispersion  matrix: 


i:«  >  < 


bW1 


w*.i 
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4.2.4  Routing  Control  Variables 

The  moments  of  the  flow  variables  proposed  by  Lavenberg  et  al.  are  in 
general  unknown.  Hence,  we  were  not  able  to  standardize  these  variables.  We 
discarded  these  controls  as  candidates  in  favor  of  a  standardized  multinomial 
control.  We  call  these  new  controls,  “routing  variables”. 

All  routing  in  the  networks  we  considered  is  done  from  the  CPU.  Now 
define  an  indicator  variable  on  the  event  of  the  l  th  departure  from  the  CPU 
to  station  j 

1  if  the  l  th  departing  customer  goes  to  station  j 
0  otherwise 


UiU)  = 


Now,  from  the  discussion  of  the  simulated  networks  we  have  Pj{*)  as  the 
probabilty  of  transition  from  the  CPU  to  station  j.  If  N(t )  is  the  total 
number  of  transitions  from  the  CPU  up  to  time  t  then 

N(t) 

S  U  i  (j)  |  N(t)  =  n  ~~3(n;Pjn), 

(-1 


a  binomial  random  variable  on  n  trials  with  success  probability  p;(*).  We 
consider  standardized  controls  of  the  form 


"i!)  Vi(j)-p,n 
i-i  V*(i)(i  -p,(*))p,C) 
0 


for 


J  =  1,  '  '  '  ,  S  if  N(t )  >  0 
if  N{t)  =  0 


Since  E 


)  |  iV(i )  =0  for  all  n  and  for  all  t,  we  see  that  Xf{t)  has  mean 
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zero  not  merely  asymtoticallv  but  for  all  times  t: 


'[-vf(0 


=  0  for  all  t  >  0  ; 


moreover 


var 


xf(0|N(0  = 


n 


=  1  for  n  >  0 


lim 

t  —+oo 


var 


X? 


-  1 


15 

v 


& 


C\ 


I 


The  covariance  matrix  for  the  routing  variables  is  given  by  the  matrix 
lim  r 


v  _ 
-RR  ~  t 


cov[x?(t),X?(t))\N(t) 


,  where  j  and  k  reflect  possible 


routings  from  the  CPU.  The  covariance  matrix  is  calculated  as 


cov 


1 


-vpj(*)Pk(*) 


v(i  - P;(*m  -Pk(*)) 


when  j—k  ,  Nlt)~> 0 


when  j^k  ,  N{t  )>0 


when  N{t  )=0 


The  derivation  of  (4.2.4. l)  is  given  in  Appendix  4. 


(4.2.4. 1) 


Standardized  routing  variables  share  the  same  desirable  asymptotic 
properties  as  the  standardized  controls  proposed  by  Wilson  and  Pritsker 
(1984a,  b).  Let  XR(t )  =  Xf  (t),  ...,  XR{t)  .  In  Appendix  5  we  show  that 

XR(t )  P  •JV,(O.Zai),  (4. 2. 4. 2) 

n — *oc 

in  a  broad  class  of  queueing  networks  that  possess  the  regenerative  property. 
We  require  s—S— 1.  This  requirement  is  explained  in  Appendix  5. 

4.2.5  Selected  Performance  Measures 

To  assess  the  efficiency  of  using  the  “best”  subset  of  controls,  we 
concentrated  on  two  performance  measures:  (a)  coverage  and  (b)  volume 
reduction.  Coverage  was  computed,  across  a  metaexperiment,  as  the  actual 
proportion  of  generated  confidence  ellipsoids  that  contain  the  true  mean 
vector.  Volume  reduction  was  computed,  across  a  metaexperiment,  as  the 
average  percentage  reduction  in  the  volume  of  the  confidence  ellipsoid 
generated  with  the  “best”  subset  of  controls  relative  to  the  volume  of  the 
ellipsoid  generated  by  direct  simulation. 

For  experiment  l  ,  let 

Volume  of  the  confidence  ellipsoid 
V  i  =  '  generated  with  the  best  subset  of 
controls, 

V<  -■ 


Volume  of  the  confidence  ellipsoid 
generated  by  direct  simulation, 


p 


l 


1  if  the  controlled  confidence  ellipsoid 
contains  the  true  mean  vector 
0  otherwise, 


P? 


1  if  the  uncontrolled  confidence  ellipsoid 
contains  the  true  mean  vector 
0  otherwise, 


Across  the  metaexperiment  we  compute 


1  50. 

V  =  —  VV 


50 


/-l 


.  1  5°, 

Vd  =  —  W 


50 


l- 1 


d 

l 


-  >  I  50  „ 

P  -—vp 

50  i 


P  d 


50 


50  a 

VP 

l-l 


d 

l 


so  that  the  final  performance  measures  are 


1  - 


Volume  Reduction  {%)  = 


xlOO 


Coverage  Probabiltv(%) 
(Direct  Simulation) 

Coverage  Probabilty(9c) 
(Best  Subset  of  Controls) 


=  P  x 1 00 


=  P  xlOO 


When  Y{3)  is  used,  the  expression  used  for  volume  reduction  can  be 
obtained  from  equations  34  and  35  of  Rubinstein  and  Marcus  (1985). 
Specifically 


V 

1/2 

‘-Y  X 

Syy  1 

1/2 

p/2 


(K-l){K-Q-p) 


'p/2  p  (■,  \  ]p/2 

£ P,K-Q-p{ l~a) 

Fp,K-pll~a) 


and  .vhen  Y(o)  is  used,  we  have 


Vd  1/2  (K-l)(K-Q-p) 


]p/2  r  p  n_ lp/2 

^P<K-Q-p(l  a) 
Fp,K-p  (1~°0 


where  all  notation  is  as  appears  in  Section  2.4.  After  some  pilot 
experimentation  we  decided  to  employ  the  standardized  work  variables  of 
Wilson  md  Pritsker  (1984a,  b),  as  well  as  the  standardized  routing  variables 
proposed  above. 

4.3  Optimal  Subset  Selection  Methodology 

Within  each  basic  experiment,  a  set  number  of  replications  of  the 
simulation  model  were  performed.  Response  and  control  variable  data  were 
collected  for  each  replication.  After  the  data  had  been  collected  for  the  last 
replication  within  the  basic  experiment,  a  control  variable  selection 
methodology  was  applied  against  the  data.  This  procedure  computes  the 
multivariate  selection  criterion  for  all  possible  subsets  of  controls  and  finds 


the  minimum.  The  subset  corresponding  to  the  minimum  value  of  the 
selection  criterion  is  deemed  '“best”  or  ‘'optimal”.  Given  the  optimal  set  of 
controls,  the  coverage  and  confidence  region  volume  reduction  measures  were 
computed  and  tallied. 


The  selection  procedure  is  initiated  by  the  construction  of  a  grand 
covariance  matrix  that  includes  the  responses  and  the  controls.  Next,  a 
multivariate  generalization  of  a  binary  search  of  the  regression  “tree”,  as 
proposed  by  Furnival  and  Wilson  (1974),  is  employed  to  examine  all  possible 
subsets  of  controls.  Finally,  the  procedure  computes  the  performance 
statistics. 


.As  demonstrated  in  Sections  2.1  and  2.2,  the  control  variate  technique 
can  be  viewed  as  a  linear  regression  problem.  In  Section  2.1,  we  consider  the 
case  where  there  is  one  response  and  one  control.  In  this  case,  if  we  assume 
joint  normality  for  Y  and  X,  then  conditional  on  X— x,  we  have  the  classical 
regression  problem 


Y  =  Da  +  e  , 


(4.3.1.1) 


where  Y  =  (Y,,  •  ■  •  ,  Y K)  ,  a  =  [  ?  ,  D  is  as  in  equation  (2.1.17),  K  is  the 

number  of  replications,  e'  =  (e { ,  •  •  •  ,  eK)  a  vector  of  residuals  such  that 
e,  "  IID  ;V(0,  cr,2),  and  3  is  as  in  equation  (2.1.11). 


Under  the  multinormal  assumption,  the  least  squares  and  maximium 
likelihood  solution  for  a  is  given  bv 


a  =  (D'D)_1DT  . 


a 


We  note  that  the  only  remaining  unknown  in  the  model  is  the  variance  of  the 
residuals.  An  unbiased  estimator  is  given  by 


(Y  -  Dn  Y(y  -  Da 
K—2 


It  is  apparent  that  the  computational  overhead  necessary  to  completely 
estimate  the  model  of  equation  (-4.3. 1.1)  lies  mostly  in  the  inversion  of  D'D. 

When  we  extend  the  model  of  equation  (4.3. 1.1)  to  the  case  of  a 
univariate  response  with  multiple  controls,  as  well  as  to  the  case  of  multiple 
responses  with  multiple  controls,  we  see  that  little  changes  from  a 
computational  viewpoint.  In  particular,  for  the  case  of  a  univariate  response 
with  multiple  controls,  the  model  becomes 


Y  =  Da  4-  e  , 


(4. 3.1.2) 


where  Y  =  (Yj,  •  •  •  ,  Y K)  ,  a'  =  (//y,  Jx,  •  •  ■  ,  3q),  D  is  lescribed  by 
equation  (2.2.35),  K  is  the  number  of  replications,  e'  =  (elt  •  •  •  ,  tK)  is  a 
vector  of  residuals  such  that  e,  —  IID  :Y(0,  <7f2),  3i  is  as  in  equation  (2.2.33). 

Under  the  multinormal  assumption,  the  least-squares  and  maximum 
likelihood  estimates  for  a  and  cr,2  are  respectively  given  by 


a  =  (D'D)_1DT 


(y  -  Payer  -  Da) 

K—O  —1 
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Seber  (1977).  Finally,  in  the  case  of  multiple  responses  and  multiple  controls, 


the  model  is 


Y  =  DA  +  E  , 


(4. 3.1. 3) 


where  Y  =  (Y^,  ■  •  ■  ,  Y'p')  ,  and  represents  K  independent 

observations  on  variable  j  ■  Here  A  =  (a^,  ■  ■  •  ,  a^)  and 
at;)  =  i|j),  •  •  •  ,  J^y,  so  that  A  is  the  matrix  of  regression  (control) 

coefficients.  Moreover,  D  is  as  in  equation  (2.1.35),  K  is  the  number  of 
replications,  and  E  =  (e^,  •  •  •  ,e^)  where  is  the  column  vector  of  K 
residuals  for  the  jth  variable  so  that  each  row  of  E  is  ~  IID  N.  (0,  £)  By 
similiar  arguments  (Seber  (1984)),  we  estimate 


A  =  (D'D)-1D'Y  . 


(4.3. 1.4) 


(Y  -  DAl'fY  -  DA' 


K-Q-l 


For  both  the  models  given  in  equations  (4.3. 1.2)  and  (4.3. 1.3),  we  can  see  that 
the  bulk  of  the  computation  involved  in  estimating  model  parameters  lies  in 
the  inversion  of  D'D.  In  the  next  section  we  discuss  efficient  Gaussian 
elimination  methods  to  accomplish  this  end. 


v*'/v  V 


•  >  >  L  \  J*  * 


4.3.1  Matrix  Methods 


In  this  section  we  discuss  methods  used  to  invert  matrices  of  the  form 
D'D  ,  where  D  is  KxQ  and  of  column  rank  Q.  These  methods  are  based  on 
elementary  row  operations.  Such  methods  are  called  Gaussian  elimination 
methods.  We  will  discuss  the  Gauss-Jordan  method,  the  sweep  operator, 
symmetric  sweeping,  and  Gaussian  elimination. 

Following  Kennedy  and  Gentle  (1980),  we  enumerate  the  following 
elementary  row  operations  on  a  matrix: 

1.  Interchange  two  rows. 

2.  Multiply  any  row  by  a  constant. 

3.  Add  one  row  to  another.  To  effect  an  elementary  row  operation,  one 
can  perform  the  operation  on  the  identity  matrix  and  simply  premultiply 
this  new  matrix  on  a  target  matrix.  These  altered  identity  matrices  are 
often  called  elementary  matrices.  Methods  based  on  these  elementary 
transformations  are  called  Gaussian  elimination  methods. 

Chvatal  (1983)  presents  a  clear  exposition  of  Gaussian  elimination 
methods.  He  condenses  the  elementary  operations  needed  to  invert  a  matrix 
into  two  basic  matrices  (a)  the  permutation  matrix,  and  (b)  the  so-called  eta 
matrix.  The  permutation  matrix  is  an  elementary  matrix  that  simply 
interchanges  two  rows.  The  eta  matrix  is  a  succession  of  elementary  matrices 
that  zero  selected  elements  of  a  matrix. 

The  Gauss-Jordan  method  of  matrix  inversion  is  based  on  the  following 
observation.  Let  the  matrix  T  be  the  product  of  successive  multiplications  of 


A--'- 


'''-‘-“."A  •Mv'vV'. 


those  elementary  matrices  which  must  be  applied  to  a  square  matrix  D'D 
(where  D  is  K  <0  and  of  column  rank  0)  in  order  to  reduce  it  to  the  identity 
matrix.  That  is,  if  E;  is  the  yk  elementary  matrix,  then,  if 

T  =  Ep ....  E, 


we  have 


T(D'D)  =  lQ 


and 


T(D'D)(D'D)_1  =  Iq(D'D)_i 


or 


TIq  =  (D'D)"1 


Now  to  implement  this  method,  we  merely  augment  D'D  with  I q  and 
apply  T  to  both.  In  the  notation  of  Kennedy  and  Gentle  (1980), 

[(D'D)  1 1^  ]  -p  \\q  |  (D'D)-1] 


Beaton  (196-1)  exploited  the  fact  that  for  each  column  reduction  of  D'D 
all  the  columns  of  the  identity  matrix  remain  in  the  transformed  (augmented) 
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matrix.  Essentially,  he  simply  replaces  the  newly  created  identity  column 
with  its  counterpart  in  the  right  hand  side  of  the  augmented  matrix.  This 
storage  innovation  allows  for  the  inversion  of  a  matrix  in  its  own  space.  The 
operations  which  accomplish  this  are  called  sweeps.  Beaton  (see  Seber  (1977), 
pp.  351)  offers  the  sweep  operators.  A  matrix  G  is  said  to  be  swept  on  the 
kth  row  and  column  if  it  has  been  transformed  into  a  matrix  G  =  such 

that 


Qkk  ~ 


1 

Qkk 


'  _  Qtk  (■!].' 

Qtk  —  vl  , 

Qkk 


at,  =  —  (;'#*) 

Qkk 


'  QikQkj  ,  .  .  /  t  \ 

Qij^Qxj-  - 

Qkk 


Schatzoff  et  al.  (1968),  Seber  (1977),  and  Kennedy  and  Gentle  (1980) 
discuss  the  properties  of  the  sweep  operator  in  varying  detail.  A  useful 
result,  given  in  Kennedy  and  Gentle  is  the  following.  Let  G  be  the 
augmented  matrix 


D'D  D'Y 
Y'D  Y'Y 


(4.3. 1.5) 


where  D  is  KxQ  (with  column  rank  O)  and  Y  is  K  <p.  Now,  if  G  is  swept  on 

* 

the  first  Q  rows  and  columns,  then  the  resultant  matrix  G  is 


G 


(4. 3. 1.6) 


(D'D)-1  (D'D)_1D'Y 

— Y'D(D'D)_1  Y'Y  -  Y'D(D'D)_1D'Y 

In  particular,  if  D  and  Y  are  “centered”  (that  is,  if  the  sample  mean  of  each 
variable  being  subtracted  from  each  obsevation),  then 


G' 


(D'D)-1  A' 

-A  RSS  ' 


where  A  is  as  given  in  equation  (4.3. 1.4)  and  RSS  is  the  matrix  of  residual 
sums  and  cross  products. 


The  matrix  G,  is  symmetric.  The  symmetry  of  G  can  be  exploited  by 
working  only  on  the  upper  triangle.  This  method  is  called  the  symmetric 
sweep.  It  is  due  to  Steifel  (1963)  and  for  each  pivot  on  matrix  G,  we  get  the 

J- 

matrix  G  such  that 


Okk  — 


-1 

Okk 


Oik  =  Ola  =  OikSkk  (i^k) 
9ij  ~  Op  =  Oij  +  OikOkj 


Sweep  and  symmetric  sweep  operators  produce  both  the  regression  (or 
control)  coefficients  and  the  RSS  matrix.  For  some  applications,  such  as  our 
problem,  there  is  no  need  to  calculate  all  the  control  coefficients  for  every 
sweep.  In  these  app'^ations,  the  primary  interest  is  the  RSS  matrix.  The 
desired  result  can  oe  obtained  by  applying  Gaussian  elimination  to  the 
matrix  G,  given  in  equation  (4.3. 1.5).  Seber  (1977,  page  304)  shows  the 
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desired  result  for  the  case  when  Y  is  lxA'.  The  result  for  a  Kxp  matrix  Y  is 
suggested  by  Kennedy  and  Gentle  (1980,  equation  (7.21))  and  is  reproduced 
here  as  equation  (4.3. 1.6).  To  prove  that  Gaussian  elimination  produces  the 
appropriate  RSS  matrix,  one  premultiplys  the  matrix  G  of  (4. 3. 1.5)  by  the 

matrix 

% 

T  0 

— Y'D(D'D)-1  Ip 

where  T  is  the  lower  triangular,  nonsingular  matrix  that  reduces  D'D  to  unit 
lower  diagonal.  If  D'D  is  nonsingular  then  the  existence  of  T  is  assured,  see 
Seber  (1977,  Chapter  11). 

4.3.2.  Generation  of  All  Possible  Regressions 

In  order  to  implement  the  criteria  obtained  in  Chapter  3,  a  methodology 
is  required  to  generate  the  needed  regression  information  from  each  subset  of 
regressors.  Several  systematic  procedures  have  been  suggested.  Garside 
(1971),  Schatzoff  et  al.  (1968),  and  Furnival  (1971)  offer  methods  based  on  a 
binary  coding  for  each  subset  of  regressors.  Furnival  and  Wilson  (1974)  offer 
algorithms  which  produce  all  possible  regressions  in  orders  they  call  natural, 
lexicographic,  and  familial.  These  orderings  are  based  on  the  systematic 
search  of  a  structure  called  a  regression  tree. 

As  we  have  seen  in  the  previous  section,  Gaussian  elimination  and/or 
sweeps  (henceforth,  we  refer  to  individual  eliminations  as  “pivots”)  can 
produce  regression  information  for  some  subset  of  the  regressors  (controls). 
We  need  to  generate  a  sequence  of  pivots  on  carefully  stored  matrices,  such 
that  every  possible  subset  of  controls  is  considered.  Following  Furnival  and 


I 
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Wilson  (1974),  we  form  a  regression  tree  as  follows  (see  Figure  3).  At  the 
root  of  the  tree  is  the  original  covariance  matrix.  At  this  point  no  variables 
(regressors)  have  been  allowed  to  pivot  into  the  model.  The  notation  123 
signifies  that  there  are  3  candidate  controls.  Dark  lines  on  the  tree  signify 
pivots.  For  instance,  the  dark  line  emanating  from  the  root  signifies  a  pivot 
on  variable  1.  Note  that  the  resultant  notation,  after  this  pivot,  is  23.1. 
Integers  after  the  dot  signify  those  variables  included  in  the  model;  the 
variables  appearing  before  the  dot  are  not  yet  in  the  model.  Dotted  lines 
represent  the  deletion  of  a  variable  from  the  model.  For  instance,  the  dotted 
line  emanating  from  the  root  signifies  the  deletion  of  variable  1.  Note  that 
this  deletion  results  in  the  model  23.  At  each  node,  we  either  pivot  a  variable 
into  the  model  or  delete  a  variable  from  the  model.  These  pivots  and 
deletions  are  carried  out  until  all  possible  subsets  of  models  have  been 
considered.  Furnival  and  Wilson  point  out  that  this  tree  can  be  traversed  in 
any  “biologically”  feasible  order.  That  is,  we  require  only  that  a  father  be 
born  before  a  son.  If  we  search  the  tree  horizontally  from  level  to  level,  this 
is  the  so-called  natural  order.  Storage  savings  are  possible  if  if  we  use  the  so 
called  lexicographic  (dictionary-like)  order.  Furnival  and  Wilson  also  offer  a 

(a)  binary  ordering  which  amounts  to  a  counting  process  in  base  two,  and  a 

(b) familial  ordering  in  which  both  horizontal  and  vertical  elements  are 
combined.  Table  4.4  shows  the  order  of  the  regressions  produced  by  the  four 
methods  addressed  above.  Furnival  and  Wilson  offer  algorithms  for  each 
ordering. 


Table  4.4  Sequences  of  Regressions 


latural 

Lexicographic 

1 

1 

1 

1 

2 

12 

2 

2 

3 

123 

12 

3 

4 

1234 

3 

4 

12 

124 

13 

12 

13 

13 

23 

13 
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123 

23 

23 

14 
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123 
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14 
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24 

24 
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24 
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24 
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134 
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34 

234 

234 

1234 

4 

1234 
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CHAPTER  5 


EXPERIMENTAL  RESULTS 


In  this  chapter  we  discuss  the  experimental  portion  of  the  research.  We 
summarize  the  results  of  our  simulation  experiments  and  conclude  the 
chapter  by  discussing  a  few  experimental  excursions  designed  to  examine  the 
underlying  assumptions  of  our  analysis. 


;  5.1  Summary  of  Experimental  Results 

1  As  mentioned  in  Section  4.1,  we  experimented  with  two  different  types  of 

t 

|  closed  queueing  networks.  For  purposes  of  discussion,  we  refer  to  the  first 

;  type  of  system  (no  blocking  at  the  CPU)  as  a  type  I  network  and  the  second 

’  type  (blocking  at  the  CPU)  as  a  type  II  network.  Models  1  and  2  (models  4 

I  and  5  of  Lavenberg,  Moeller,  and  Welch  (1978))  were  type  I  models,  while 

models  3  and  4  (models  15  and  16  of  Lavenberg  et  al.)  were  type  II  models. 

■  All  models  were  discrete-event  simulations  written  in  SLAM  (Pritsker  1986) 

» 

l 

I  and  a  FORTRAN  listing  of  both  types  is  given  in  Appendix  5. 

i 

t 

One  basic  aim  of  this  research  is  to  provide  a  simulation  practitioner 

{ 

\  with  a  methodology  whereby  he  can  select  the  “best”  subset  of  controls  to 


} 


use  to  in  constructing  a  confidence  region  for  the  steady-state  mean  vector  of 
responses.  The  analysis  program  we  developed  to  accomplish  this  end  is 
given  in  Appendix  7. 

For  type  I  networks,  it  is  possible  to  obtain  analytically  ,the  steady-state 
expected  values  of  the  response  vector  R(£),  UCpu(0|  •  We  used  the 

software  package  CAN-Q  (Solberg  (1980))  to  calculate  these  values.  Since  it 
is  impossible  to  run  our  simulations  to  infinity,  we  settled  for  long  run 
lengths.  For  type  I  models,  we  chose  a  run  length  of  20,000  time  units.  For 
type  II  models,  we  chose  a  run  length  of  30,000  time  units.  We  started  the 
collection  of  statistics  after  2,000  time  units  in  an  effort  to  minimize  the 
effects  of  initialization  bias.  We  chose  the  longer  run  length  for  type  II 
models  because  of  the  presence  of  blocking  at  the  CPU.  When  steady-state 
expected  values  (/iy)  were  not  available,  we  used  the  grand  mean  vector  of 
the  2,000  replications  (y(2,000))  as  the  population  mean.  We  report  our 
results  relative  to  both  nY  and  (F(2,000)).  Table  5.1  summarizes  this 
information  for  all  four  models. 
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Table  5.1  Mean  Responses  for  the  Queueing  Systems 
Used  in  the  Experimental  Evaluation 

Steady-State  Mean 

- 1 

Y  (2,000) 

Model 

R(0 

Ucpu(*  ) 

R(0 

ucpu(0 

1 

36.13 

.918 

36.04 

.9177 

2 

81.71 

.413 

81.14 

.4128 

3 

* 

* 

247.06 

.3590 

4 

* 

* 

85.92 

.6625 

The  candidate  controls  chosen  for  the  type  I  networks  were  the  four 
standardized  work  variables  (collected  at  all  stations)  and  the  three 
standardized  routing  variables.  Note  that  the  analysis  program  includes  a 
tolerance  check  on  incoming  variables.  This  check  precludes  multicollinearity 
problems  which  would  result  if  all  three  routing  variables  were  included  in 
the  model.  We  chose  only  seven  of  the  available  control  variables  as 
candidates  for  the  type  II  models.  This  was  done  to  keep  the  analysis  at  a 
comparable  dimensionality  across  both  types  of  networks.  We  chose  as 
candidates  the  standardized  work  variables  for  the  CPU  and  the  two  busiest 
disk  drives;  also  we  used  four  standardized  routing  variables  (excluding  only 
routing  to  less  frequented  disk  drives). 


15 

For  each  model  we  ran  2,000  independent  replications.  The  first  1000 
replications  were  used  to  yield  50  meta-experiments  of  replication  size  20.  All 
2,000  replications  were  used  to  produce  50  metaexperiments  of  replication 
size  40.  We  report  (with  respect  to  both  jJ.Y  and  T(2,000)  )  estimated 
coverage  probabilities  and  estimated  volume  reductions  for  both  estimators 
{Y(3)  and  Y[/'t)).  Nominal  coverage  was  90%.  Tables  5.2  and  5.3 
summarize  our  experiments. 


Table  5.2  Performance  of  the  Controlled  Point 
and  Confidence  Region  Estimators  for  K=20 
Replications  of  the  Selected  Queueing  Systems 

Coverage  Probability  ( 

%) 

Volume 

Steady-state  mean 

Y  (2,000) 

Reduction  (%) 

Model 

_  ...  . 

Y(3)  ?6) 

Y(3) 

Y(3) 

Y(3) 

Y{l ) 

1 

78  80 

86 

86 

73 

45 

2 

28  64 

78 

80 

83 

52 

3 

★  * 

82 

90 

61 

43 

4 

★  * 

83 

84 

46 

34 

« j"..  r ■ 


■  »*- ‘WVv 


Table  5.3  Performance  of  the  Controlled  Point 
and  Confidence  Region  Estimators  for  K=40 
Replications  of  the  Selected  Queueing  Systems 


Model 


Coverage  Probability  {%) 


Volume 


Steady-state  mean  y(2,000)  Reduction  (%) 


Y(3 )  Yfr)  Y(3)  rtf)  Y(3)  Y(:;) 


84 

90 

63 

53 

88 

86 

47 

41 

We  observe  that  in  all  cases  Y[^)  covered  the  steady-state  mean  better  than 

_  A  _  A 

Y{3).  Further,  Y (7 )  offered  comparable  if  not  superior,  coverage  relative  to 

_  A  _ 

Y(3)  when  T(2,000)  is  taken  as  the  target  mean.  This  improvement  in 
reliability  is  probably  due  to  the  conservatism  of  Y (7 )  to  Y(3)  as  reflected  in 
the  realized  volume  reductions. 
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5.2  Examination  of  the  Assumptions  Underlying  the  Application  of 


Control  Variables 


Two  major  assumptions  come  into  play  as  one  applies  the  estimators 
Y(3)  and  Y"(7).  Confidence  interval  procedures  for  both  estimators  are  based 
on  the  assumption  of  joint  normality  between  the  responses  and  the  controls. 
The  confidence  interval  procedure  based  on  Y (7 )  assumes  that  the  mean 
vector  and  dispersion  matrix  of  controls  is  known.  In  the  case  of  the  controls 


we  applied,  these  quantities  are  known  only  asymptotically.  We  also  assumed 
that  the  runs  were  of  sufficient  length  that  the  response  vector  could  be 
assumed  to  be  in  steady-state. 

We  carried  out  three  excursions  from  the  primary  analysis  to  gauge  the 
effects  of  the  underlying  assumptions.  We  hoped  these  experiments  would 
shed  some  light  on  the  sensitivity  of  the  procedures  to  the  underlying 
assumptions.  First,  we  looked  at  the  situation  where  the  responses  and 
controls  were  distributed  as  jointly  normal  random  variables  with  all  means 
and  covariances  known  exactly.  We  calculated  the  sample  covariance  matrix 
of  responses  and  controls  for  2,000  replications  of  model  2  (chosen 
arbitrarily).  We  took  this  covariance  matrix  to  be  the  population  covariance 
matrix  of  responses  and  controls.  Next,  we  generated  2,000  independent, 
normally  distributed  random  vectors  based  on  this  structure.  We  repeated 
our  basic  analysis  for  replication  sizes  of  20  and  40.  In  this  experiment  the 
means  of  the  responses  as  well  as  the  controls  are  known  exactly.  The 
results  are  Table  5.4 


Table  5.4  Performance  of  the  Controlled  Point  and  Confidence 
Region  Estimators  when  Multivariate  Normality  is  Ensured 


Coverage 
Probability  (%) 


Volume 
Reduction  {%) 


Replications 

Y(3) 

rfr) 

Y0) 

Y(l) 

20 

89 

93 

81 

50 

40 

92 

94 

85 

69 

We  observe  that  under  ideal  conditions  both  estimators  deliver  nominal 

coverage  and  Y(~/)  is  more  conservative.  We  feel  that  the  results  above 
_  /»• 

validate  Y'('y)  as  a  viable  estimator. 

Next,  we  wanted  to  see  if  we  could  determine  whether  it  was  the  lack  of 
normality  in  the  responses  or  insufficient  run  lengths  which  degraded 
estimator  perfromance  in  the  actual  simulations.  To  accomplish  this  end,  we 
applied  normalizing  transformations  (Box,  Hunter,  and  Hunter  (1978),  and 
(Anderson  and  Mclean  (1974))  in  an  effort  to  make  the  data  appear  more 
normally  distributed.  We  took  the  natural  logarithm  of  the  response  times 
and  applied  the  transformation  z  =  arcsin  (\AJcpu(0)  CPU 

utilizations.  The  results  are  Table  5.5.  Coverage  is  relative  to  the  grand 
average  of  the  2,000  transformed  response  vectors  (Yz(2,000))  . 


’•  -SW/vV/V 


9 


11 


Table  5.5  Performance  of  the  Controlled  Point  and  Confidence 
Region  Estimators  Under  Normalizing  Transformations 
of  Queueing  Simulation  Responses 

Coverage 

Probability  {%) 

Yz{ 2,000) 

Volume 

Reduction  {%) 

Replications 

Y(0)  Y(-i ) 

Y0)  f(i) 

20 

40 

84  85 

74  90 

82  51 

86  70 

Comparing  Tables  5.3  and  5.4  to  Table  5.5,  we  see  that  there  seems  to  be  an 

_  A  _  A 

indication  that  T("y)  is  less  sensitive  to  departures  from  normality  than  Y(f3). 

Finally,  we  were  interested  in  the  effects  of  run  length.  We  contrast  the 
performance  measures  at  run  lengths  of  5,000  and  20,000  time  units, 
respectively.  The  results  are  Table  5.6.  The  replication  level  is  20. 


PS 

•4 


PS: 
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Table  5.6  Performance  of  the  Controlled  Point  and  Confidence 
Region  Estimators  for  Queueing  Simulations  of  Different  Run  Lengths 

Coverage  Probability  {%) 

Steady-state  mean  V(2,000) 

Volume 

Reduction  (%) 

Run  Length 

Y(0)  Yh) 

Y(3)  Yfy 

Y(3)  ?(-,) 

5000 

30000 

18  33 

28  64 

90  92 

78  80 

57  39 

83  52 

We  observe  that  the  increase  in  run  length  yields  improvements  in  the 
coverage  of  the  steady-state  mean  vector  for  both  estimators.  We  also 
observe  a  degradation  in  coverage  about  T(2,000)  when  the  run  length  is 
increased.  Volume  reductions  appear  to  be  significantly  larger  for  the  long 


runs. 


CHAPTER  6 


CONCLUSIONS  AND  RECOMMENDATIONS 

0.1  Overview 

This  research  offers  a  solution  to  the  general  problem  of  optimal 
selection  of  control  variates.  We  offer  solutions  for  two  different  cases  of  the 
general  problem:  (a)  when  the  covariance  matrix  of  the  controls  is  unknown, 
and  (b)  when  the  covariance  matrix  of  the  controls  is  known  and  is 
incorporated  into  point  and  confidence  region  estimators.  For  the  second 
case  we  introduce  a  new  estimator  that  we  represent  by  the  symbol  Y’('y). 
Under  the  assumption  that  the  responses  and  the  controls  are  jointly  normal, 
we  have  established  the  unbiasness  of  this  new  estimator,  and  we  have 
derived  its  dispersion  matrix.  We  have  implemented  a  selection  algorithm 
which  locates  the  optimal  subset  of  controls.  The  algorithm  is  based  on 
criteria  we  derive  for  the  two  cases  listed  above.  We  have  introduced  a  new 
class  of  controls  which  we  call  “routing  variables”.  We  derived  the 
asymptotic  distribution  of  these  controls  as  well  as  their  asymptotic  mean 
and  variance.  Finally,  we  have  investigated  the  performance  of  the  selection 
algorithm  and  we  have  contrasted  the  estimators  Y{3)  and  Y(i). 
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8.2  Conclusions 

We  conclude  that  the  selection  algorithm  delivers  a  workable  set  of 
controls  that  yield  minimum  expected  squared  volume.  Under  ideal 
conditions,  confidence  region  procedures  based  on  both  estimators  deliver 
nominal  coverage  and  significant  volume  reduction.  The  estimator  Y ( '• ) 
appears  to  be  more  conservative.  The  conservatism  of  Y(~<)  greatly  enhances 
its  reliability  when  the  underlying  assumptions  are  violated. 

Routing  variables  prove  to  be  a  significant  new  class  of  controls 
variables,  in  that  they  entered  every  model  fitted  during  the  course  of  the 
analysis.  They  are  easy  to  implement  and  should  be  considered  as  candidate 
controls  in  any  simulation  that  contains  probabilistic  branching. 

0.3  Recommendations 

There  are  several  avenues  of  potential  future  research  that  present 
themselves. 

1.  In  the  case  where  the  covariance  matrix  of  the  controls  is  known,  we 

_  /s 

offer  an  unbiased  estimate  of  the  dispersion  matrix  of  Y(i)  which  we 
call  £.  The  theoretical  properties  of  £  have  yet  to  be  established.  It 
would  also  be  of  interest  to  investigate  other  estimators  of  E. 

2.  We  employed  a  “all-regressions”  approach  to  find  the  optimal  subset  of 
controls.  Generalization  of  a  search  scheme  that  avoids  total 
enumeration  of  the  subsets  would  be  useful.  A  multivariate 
generalization  of  the  branch  and  bound  algorithm  suggested  by  Furnival 
and  Wilson  (1974)  is  immediately  suggested. 
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3.  Dr.  Bruce  Schmeiser  has  suggested  that  the  response  variables  be  scaled 
prior  to  analysis  to  reflect  the  preferences  of  whoever  uses  the  model  to 
make  a  decision.  Research  aimed  at  incorporating  the  preference 
structure  and  risk  adversion  of  the  decision  maker  would  be  of  interest. 
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Appendix  1:  Derivation  of  Equation  (3.2.1.10) 


Now,  define  Z  =  vec  X  =  vec  (A'(,  •  •  •  , A^ )  so  that  Z  is  the  (KQ) 
dimensional  column  vector  formed  by  stacking  the  K  X  one  upon  another. 


Now,  the  condition  j  A”;  =  i  can  be  more  compactly  expressed  as  Z  = 


Also,  define  Hz  ~  vec  (1  k^^x)-  Write 


Y(--)  =  Yi.  ■  •  ,yK  H  =  YH 


(A.  1.1' 


vec  Y(-  )  =  vec  (YH)  =  (H'XM  vec  Y  . 


(A.1.2) 


covf  vec  YH  =  £r  cov  vec  YH  !  Z  =  z  4-  cov  E  vec  YH  Z  =  z 


(A.1.3) 


Examining  the  first  term  of  the  right  hand  side  of  equation  (A.1.3) 

Ex  cov  vec  YH  |  Z  =  zj]  =  Ex  cov  (H't?/p )  vec  Y|Z  =  zll 


=  Ex  ^(H'®/p)cov  ^  vec  Y|  Z  =  zj(H'X/p)'j  (A.  1.5) 

=  Ex  (H'0/p)(/Ic^>Ly  ix)(H'%Jp)'  (A. 1-6) 

=  Ex[(H'®Ey:x)(HX/p)]  (A. 1.7) 

=  Ex  (H'H&£yj_y)  =  Sy  i XEX  H'H  (A. 1.8) 


Now 

H'H  =  K  1  +  [K— 1)  *(X— Mx)/Sxx'SjCySjor(X—  /7x)  (A. 1.9) 


E  H'H  —  K  1  +  E  tr(K— 1)  l(X—f2x)'ExxSxx'Exx(X.—fix)  (A.  1.10) 

by  the  properties  of  the  trace  (tr).  Now 

E  tr(K— 1)  !(X— ^x)'^xx^xx^xx0^~ Mx)  =  (A. 1.11) 
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(K-iyhr  [e[(X-^x)'^S^^(X-^x)JJ  =  (A. 1.12) 


{K— 1)  Hr  E  (X— Mx)(X— ExY^xx^  XX^XX  ~  (A.1.13) 


(AT— 1)  Hr  E  (X-Mx)(X-/ix)'  ^xxE  Sxx  ^ xx  (A.1.14) 


(K(K— 1))  Hr  ^xx^xx^xx^xx  — 


(K(K~l))-hr  \lQ  =  (AT(A:-l)r1(3 

. 


so  algebra  shows  that  (A.1.8)  (hence  (A.1.4))  becomes 


\XEX 


[h'h  = 


K+Q-l 


K{K- 1) 


(A.1.15) 


(A.1.16) 


(A.1.17) 


Next,  we  examine  the  second  term  of  the  right  hand  side  of  (A.1.3) 

cov  E  vec  YH  |  Z  =  zjj  =  cov  \e  |"(H'<8>/p)  vec  Y|  Z  =  all  = 


(A.1.18) 


Equation  (3.2.18)  allows 


E(XHH'X')  -  £(XH)£(XH)'  . 
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First,  we  examine  the  first  term  on  the  right  hand  side  of  equation  (A.  1.25). 
Now 

(HH')  =  K  “1K1K'  —  K  ^^(X— fix)'^xxG'iK— 1)  1  — 

—  (K— 1)  Vx)K  11k'  +  {K— 1)  2G—xs:(X— Mx)(X—  Px)'^xx^' 


We  note  that  (G  is  as  given  in  equation  (3.2.1.45) 


XG  =  (X  -  XL*')G  =  {K-^Sjoc 


(A.1.26) 


(A.1.27) 


Hence 


X(HH')X'  -  K'2XlKlKfXl  -  X(X-/iX)'^5xy' 


/v-i 


—  5xx'^xx'(X—  Mx)x  4-  SxxY.Xx(X~-fix)(X—^x)''^xxSxx' 

(A.1.28) 


Looking  at  the  first  term  in  the  right  hand  side  of  equation  (A.1.28)  we  have 
E{K~2^1k1k^)  =  A"2£(XX?)  =  K~lZxx  +  HxHx'  (A.1.29) 


Examination  of  the  second  term  in  the  right  hand  side  of  (A.1.28)  reveals 


E  [x(x— MxySjar-W  ]  =  E  [x(x-/iXy 


(A.1.30) 


E  XX?  —  Xfix'  -xx-xx  = 


(A. 1.31) 


E 


XX  —  Hx^X 


K~lZ 


XX 


(A.1.33) 


Now  we  examine  the  fourth  term  on  the  right  hand  side  of  (A.  1.28),  we 
observe  as  a  consequence  of  equation  (3.2. 1.9)  that 


Sxx  ^  jar(x— Mx)  (X— ) '  ^xx  Sxx ' 


=  cov 


Sxx^xxCX-— Mx) 


(A.1.34) 


E  [cov  Sxx-Zjq:(X—  Mx)  i  =  SM  ]]  +  cov  ^  Sxx’^xx0^~^x)  |  Sxx  —  SXX  ]]  — 


(A.1.35) 


cov 


Sxx^xxOt—Hx)  |  Sxx  —  sxx  ]]  —  E  [■?xx'^'ja:cov  [xl  sxx  — 


^xxsxx 


(A.1.36) 


Sxx^xxK  l^xx^xx^xx 


=  E 


K  1  Sxx^xx^xx 


(A.1.37) 


Let  U)  ~  Wq  (K— 1  ,  In),  where  Wq(K—  1  ,  In)  is  a  random  matrix 


r 


distributed  as  Wishart  with  K— 1  degrees  of  freedom  and  expected  matrix  Iq  . 
Now,  Sxx  Wq  (K  — 1,K  — 1  xx )  >  so  $xx  ~~  {K— 1)  l—xx^— y^t  ■  -^ow 
continuing  from  equation  (A.1.37) 


K~l  Sy-Tr~- 


J  XX -XX  J  XX 


=  E 


K  l(K— 1)  2{^^W^d^)Ey^(Eydlw'L^) 


(A.1.38) 


K-\K-1)-2Z^V)WZ^  =  K_1(K-ir2E^2E 


WW 


(A.1.39) 


Now  we  compute  E 


WW 


consider  the  E{Q  ],,  when  i 


with  the  following  arguments.  Let  Q  =  WW,  first 
=  J- 


E 

Qa 

=  E 

ZWlvWlv 

v-1 

(A.1.40) 


Q 


v-l 


wd 


(A.1.41) 


Using  equation  (4),  page  90  of  Muirhead  (1982)  (and  some  algebra),  we  get 


Q 

-  Yjvar 


w<x 


+  E 


w,i 


—  (K  — 1)(K  +Q  )  . 


it  - 1 


(A. 1.42) 


Now,  consider  when  i 


E\Qtt]  =  E  £WivWiv  = 

L  J  [u-l 


(A.1.43) 


E  [^ij  Wji  ]  +  E  [l^ij  Wy} 


+  S  E  WiyW)V  = 


(A.1.44) 


Once  again  we  apply  equation  (4),  page  90  of  Muirhead  (1982)  (and  some 
algebra)  we  get 

r  l  r  I  Q  f 

cov  Wn  U’jj  +  cov  WtjW}]  +  V;  cov  WlvWiV  =0. 

J  L  J  V=1 


(A.1.45) 


So,  we  have 


E[q\  =  (K-1)(K+Q)Iq 


(A.1.46) 


and  so  equation  (A.1.34)  (the  fourth  term  on  the  right  hand  side  of  equation 
(A. 1.28)  becomes 


(A-lj 


(A.1.47) 


We  now  have  all  the  pieces  of  equation  (A.1.28)  in  equations  (A. 1.29), 
(A. 1.33),  and  (A.1.47).  Returning  to  equation  (A.1.24),  we  see  that  we  need 
£(XH).  Now  equation  (3.2. 1.9)  implies 

' 

E(XR)  =  jix  —  E  Sxx^xxl'X—fix)  =  Mx  (A.  1.48) 


Now  putting  equation  (A.1.28)  together  yields 


cov(XH)  —  K  l^xx  ~  l^xx  +  .  . 2  Sxx  (A.1.49) 

{K-iy 


[K-l)K 


(A.1.50) 


Now,  insertion  of  equation  (A.1.50)  into  equation  (A.1.22)  implies  equation 
(A.  1.1 8)  becomes 


cov  E  vec  YH  I  Z  =  z  =  ,  ^ 

L  L  1  JJ  K— 


(K-l)K 


^yx^xx-^xy  >  (A.1.51) 


so  equations  (A.1.51)  and  (A.1.17)  combine  in  equation  (A.1.3)  to  yield 
equation  (3.2.1.10). 
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Appendix  3:  Derivation  of  Equation  (3.2.1.17) 


c  ( r.p 


A.  -j-Q  —1 
K~l 
K~  2 
K-Q-2 


.  ci  = 


K-  2 
AT+Q-1 


(A. 3.1) 


Note  c  (r.p )  ,  c  !  <  1  ,  ?;•  .  p  >  0.  Write 


n c(r’P)  i~ci px 


n  i  -pr 

t-i 


(A.3.2) 


To  start  the  induction  let  i’  =  1  ,  then 


c(l,p)  1  -  c^2 

n  - - - - <  i  , 

1  -pi 


(A.3.3) 


implies 


(l.P)(l  -CxPt]  <  (l  - 


Px  >  fy  \P)  —  <  1  •  (A.3.4) 
c(l,p)ci  -  1 


Hence  conditions  exist  when  u  —  1.  Now,  when  v  =  n+1 


nc(»+i,P)(i-c^j  c(n+J,p)(1_Ci^ij 

h(n+l)  =  - — - ; - x - : - "I -  (A.3.5) 


n  | i  -  pt 

t  -i 


i  -  pi- 


1 


-  n{n)x- 


c(n+l,p)  1  -  cxp‘  +  x 

1  -  rLi 


A. 3. 6) 


.Assume  r;(n+l)  <  1  ,  then 


c(n  +  l,p)  1  -  cxpl  +  x 

n(n)  <  -  <  1  , 

1  -  PL  i 


(A. 3. 7) 


bv  an  argument  similar  to  that  leading  to  equation  (A. 3. 4).  Now  specify  Pn  +  \ 
as  o *  and  the  induction  follows. 


v ' .*  v  «■ ,  «**’■•*-  •*.*•'*.  ■*. **. _** 


Appendix  4:  Derivation  of  Equation  (4. 2. 4.1) 


)V(1  -P;('))P;(')(1  -pt(*))pfcr) 


^t)  .V(f) 

v  U  i  (j)  V  U  m(k) 


Pj(*)*v(0  ^  ^ 

m  =  1 


;  =  i 


p,:*)pt(*)^v(o2hV(o 


(A.4.3) 


Now 


.v(<) 

P>(*W)  Vf/m(A:)|iV(0 

m  —  1 


=  P;rK(*)A-(o2 


(A.4.4) 


So  insertion  of  equation  (A.4.4)  into  the  above  yields 


cov 


X?,X?\N(t) 


£ 

•v«)  iV(t) 

-PjnPknN(t)2\N(t) 

. 

i  *1  m  — 1 

*(t  )V(i  -p;r))p;r)(i  -Pkn)Pd*) 


(A.4.5) 


NU)  N{t) 

Let  Y}  =  V  U  [  ( j )  and  Yk  =  V  Um(k )  .  Now  F;  and  are  the  marginals 

/  —  1  m  —  1 

of  a  multinomial  distribution.  Hence 


’.v(0  <v(t) 

ZUiU)  z,um(h)\x(t) 

=  E 

y;r,|A'(<) 

l  —  1  m- 1 

1 


(A. 4. 6) 


(A.4.7) 


=  cov 


r;>ri!.v(OJ  +  i?[rJ|.v(0 


-\'(0 


=  -X(t)PjnPk(*)  +  *V(02P,(*)P.(*)  •  (A.4.8) 


So,  equation  (A.4.5)  becomes  (after  some  algebra), 


cov 


X*,X?\X{t) 


_ 1 _ 

V(t )V(!  -  PjH)p;H(i  -  Pki*))Pki*) 


-N(t)P}(*)Pk(*) 


(A.4.9) 


_ -p,-(*)Pfc(*) _ 

V(1  -  P;(*))P;(*)(1  -  Pki*])Pk{*) 


(A.4.10) 


P; 

(*)p*n 

.  (J  “  P; ( 

*))(!-  Pfc(*)). 

1/2 


(A.4.11) 


rC. 


a  x  _ 


Appendix  5:  Proof  of  Relation  (4. 2. 4. 2) 


In  this  appendix  we  establish  that 


xR  ~r—x,( 

n  — >\x 


(A.5.1) 


where  XR  =  Xf ,  •  ■  •  ,XR  .  Here  there  are  1,  •  •  •  ,5  stations  with  positive 


probability  of  being  branched  to  from  the  CPU.  We  closely  follow  a  similar 


proof  offered  by  Wilson  and  Pritsker  (1984a,  appendix  A). 


Let  g(t)  be  the  greatest  integer  in  akt  for  t  >0.  Here 


P  k(*) 


E\U 


with  probab”  «y  1 


(A.5.2) 


where  k  is  the  station  branched  to,  pCPU  is  the  utilization  of  the  CPU,  and 
Uc?v  is  the  service  time  distribution  of  the  CPU.  Now  express  the 
standardized  routing  variable  XR  (t )  in  terms  of  the  partial  sum  process 


n  U A k)  —  uv 

Sn( k)=  S  J  -  «>1. 

;-l  ak 


(A.5.3) 


Wilson  and  Pritsker’s  proof  is  based  on  use  of  the  "dissection''  formula 


^k(0-^k(0  +  ^k(t)  +  «k(0 


(A.5.4) 


:  -  c.  ..  v  .  .  -  _  - 


Zk(t)=Sg{Kt]/g(k,ty'~ 


I  11/2 

Fk(t)=  |  [y(M)/N(M)J  -1  \Zk(t) 

ii/2  f  r 

Rk(t)  =  [s(k,0/iV(M)J  ^JVfk.O  —  5<7(k,t)J/?(^’01/2 


(A.5.6 

In  vector 

notation  we  set 

Z(t)  =  [zx(0,  •  ■ 

F{t)  =  [^i(0-  ' 

XR{t)  =  Z{t)  +  F(t)  +  R(t) 

(A.5.7 

To  prove  equation  (A.5.1)  we  show  that,  given  Z  ~  /Vs (OjLjyj 

b’XR{t)  — - >b'Z  V6  e  Ea  .  (A.5.8) 

t  — *oo 


Here,  ^  is  the  covariance  matrix  of  standardized  controls  with  one  of  the 
controls  removed  (more  on  this  below).  To  show  this  we  examine  the 
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•'«/.  %  »V-' 


asymptotic  behavior  of  each  component  of  b'XR(t).  First,  we  consider  the 
component  b'Z(t).  We  compute  the  moment  generating  function  of  b'Z(t): 


Mb'Z(t){&)  =  &  exp  \  9b  'Z(t ) 


=  E  exp  j  [Ob  )'Z(t ) 


=  M  z{t)(bd) 


(A.5.9) 


Now  the  multivariate  central  limit  theorem  (Neuts  (1973),  pg  287)  insures 


W,(0,sfe-1>),  |Efo«!  >0  (A.5.10) 


In  the  next  two  paragraphs  we  demonstrate  that  |  ^  |  >0. 

First,  notice  that  if  X  is  a  pxl  vector  of  random  variables  with  positive 
definite  covariance  matrix  then  if  we  transform  X  as 


Z  =  diag(<7j  *,  •  •  •  ,ak  *)  X 


(A.5.11) 


covjZ  =  diag(o-j  *,  •  •  •  ,<7k  l)  covjX]  diag((T1  l,  •  •  ■  ,ak  l)  (A.5.12) 


w: 


>vv 


co\[Z\  —  |  [ |  rrt  ‘  |  cov  X ,  >  0  , 


(A.5.13) 


hence  the  covariance  matrix  of  the  standardized  random  varibles  is  also 
positive  definite. 


For  notational  convenience  let  N(t )  =  N,  V  U  ;  (k)  =  7Vk,  and 

Ul 


N  =  Nv  ■  ■  ■  ,NS  then 


XNk  =  N 
1 


V  i).  I  _ 


(A.5.14) 


such  that 


Sjvjv^l  =  0  -+3  Xj,  •  •  •  ,\3  with  some  X,  ^0  (A.5.15) 


V  XkATk  =  0  w.p.  1 


(A.5.16) 


If  we  add  (A.5.14)  and  (A.5.16)  we  get 


£  (l+XjjA/k  +  Ns  =  N  w.p.  1 

k- 1 


Now  for  all  k,  1  <  k  <  s  — 1 


(A.5.17) 


msgm  wsem 
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Pk(*)  >  0  -*  Pr  \*\tNj=0,j±s  >  0  (A.5.18) 


— »(1  +\)N  =  N  (A. 5. 19) 

— ►  Xk  =  0  ,  (A.5.20) 

and  since  p k(  * )  >0  ,  V  k  we  have  a  contradiction.  Therefore,  the 
covariance  matrix  of  the  non-standardized  (hence  the  standardized)  controls 
is  positive  definite.  Now  >  0  validates  (A. 5. 10)  and  this  implies 

lim 

,_ccMZ(<>W=MzW 

'  ' 

{9b )' Z 


=  E 


exp ' 


=  E 


exp 


\Sb'Z\ 


=  M6.z(0) 


(A.5.21) 


■  '/V’-y-.'V',-  v"V  '■  /'S’  z 


and  equation  (A.5.8)  follows.  Using  nearly  identical  arguments  to  Wilson  and 
Pritsker,  we  can  show  that 


b'F(t) 


D 


t  — K5C 


•0  ,  b'R{t) 


D 


t  — *-oc 


0  . 


(A.5.22) 


To  finish  the  proof  we  apply  Slutsky’s  theorem  twice  and  invoke  the 
Cramer-Wold  theorem  (implied  in  equation  (A.5.8)). 


■  ■  1  >  II «  9  ■  ■  W  II  F  V  ■  V  •  Vm  V  U  »  \!  ■  fmi  u  ■  i  I  ^1 !  V  *.H  ■."  W*  H.  <  1  ■*  J  M 
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Appendix  6:  FORTRAN  Listings  of  SLAM  Models 


■^1  MJHI  W  * 
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□ 


c  program  main(input,output,tape5=  input,tape6=output,tape7,  lapel , 
c  Itape2,tape3,tape4) 
program  main 
dimension  nset(5000) 
common  qset(5000) 

common/scoml /  atrib(lOO),dd(lOO),ddl(lOO),dtnow,ii,mfa,mstop,nclnr 
1, ncrdr, nprnt,nnrun,nnset,nt  ape, ss(  100),ssl(  100),tnext,tnow,xx(l00) 
common/ucoml /  departf5brmean(4),p(4,4),servt(4),ecount(2) 
equivalence  (nset(l),qset(lj) 
nnset=5000 
ncrdr=5 
nprnt=6 
ntape  =  7 

read  (ncrdr,*)  (rmean(i),i=  1,4) 
do  12  i=  1,4 

read  (ncrdr,*)  (p(i,j),j  =  1,4) 

12  continue 

call  slam 

stop 

end 


subroutine  event(i) 

common/scoml  /  atrib(lOO),dd(lOO),dd](lOO),dtnow,ii,mfa,mstop,ncln 
1,  ncrdr,  nprnt,nnrun,nnset,ntape,ssf  100),  ssl(l00),tnext,tnow,xx(  100) 
common/ucoml/  depart(5),rmean(4),p(4,4),servt(4),ecount(2) 

ecount(l)=ecount(l)-t-l 
if(tnow.gt.2000)  ecount(2)=ecount(2)  — 1 

goto  (l,2),i 

call  arss 
return 
call  endss 
return 


subroutine  intlc 

common/scoml  /  atrib(l00),dd(100),dd!(100),dtnow,ii,mfa,mstop,nclnr 
1, ncrdr, nprnt,nnrun,nnset,ntape.ss(  100),ssl(100),tnext,tnow,xx(l00) 
common/ucoml  /  depart(5),rmean(4),p(4,4),servt(4),ecount(2) 
common/gcomS/  iised(10),jjbeg,jjclr,mmnit,mmon,nname(5),nncfi, 
&nnday,nnpt,nnprj(5),nnrns,nnstr,nnyr,sseed(10),lseed(10) 
integer  iseed(lOOO) 
common/ucom2/  multino(4) 


if(nnrun.eq.l)  then 
do  1  i=  1,1000 
iseed(i)=(l.e-fl2)*drand(l) 
continue 
endif 

iised(2)=iseed(nnrun) 

x=drand(-2) 

do  4  i=  1,4 
multino(i)=0 
continue 

do  5  i  =  1,2 
ecount(i)=0. 
continue 


do  6  i=  1,5 
depart(i)  =  0. 
continue 

do  7  i=  1,4 
servt(i)  =  0. 
continue 

do  10  i=  1,25 
etime  =  expon(rmean(l),2) 
atrib(li  =  etime 
atrib/3)  =  i 
atrib(4  )=  1 
atrib(5)  =  2 

call  schdl( i,etime,atrib) 
continue 

do  11  i=  1,4 
xx(i)=0. 
continue 
write(6,99)nnrun 

format)  lx, ’SIMULATION  STUDY  IN  PROGRESS  :  RUN  ’.i4,  ’  O 
&1000  RUNS’) 
return 
end 


subroutine  endss 

common /scorn  1  /  atrib(  100),dd(  1 00),dd  1  ( 100),dtnow,ii.mfa,mst,op.neln 
1  .nerd  r.nprnt.nnrun.nnset.n  tape. ss(  100),sslf  100),tnext,tnow,xx(  100) 
common /ucoml /  depart(5),rmean(4),p(4,4),servt(4),ecount(2) 
common/ucom2/  multino(4) 


call  schdl(  1 ,0.,atrib) 
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myq=  atrib(4) 

if(nnq(myq).ne.O)  then 
call  rmove(l,myq,atrib) 
wait= tnow-atrib(2) 
call  colct(wait.myq) 
rra=  rmean(myq) 
service=exponfrm,2) 
atrib(4)=atrib(5) 
iat  =  atrib(4)-4-. 00001 
call  nextguy(iat,inext) 

c  COLLECT  STATISTICS  WHILE  PARKED  AT  CPU 

c 

if(iat.eq.2)  then 

multino(inext)  =  multino(inext)  —  1 
endif 

atrib(5)=  inext 
call  schdl(2, service, atrib) 
if(tnow.gt.2000)  then 
servt(myq)=servt(myq) -‘-service 
depart!  myq)= depart  (myq)-l 
depart(5)=depart(5)~l 
endif 
else 

xx(myq)  =  0. 
endif 

return 

end 

(' 

c 

c 

subroutine  arss 

common /scoml /  atrib(  100), dd(  100).ddl(  100),dtnow,ii,mfa,mstop,nclnr 
1  ,ncrdr,nprnt,nnrun,nnset,ntape,ssf  100),sslf  100),tnext,tnow,xx(l00) 
common /ucoml/  depart(5).rrnean(4),p(4,4),servt(4),ecount(2) 
common/ucom2/  multino(4) 

iat  =  atrib(5) 

if(iat.eq.l)  then 
resp=  tnow-atrib(l) 
cal!  colct(resp,l) 


rm  =  rmean(  1 ) 
service  =  expon(  rm.2 ) 
at  rib(  1  )  =  tnow  -service 
atribf  4  )=  1 
atrib(5)=2 


*  ►  "m  '  •  *  »  '  «•  *  »  "  Ik  1  fc  *  *  "  ►  ‘  *  '  >  "  »  '  «•  ■  •  '  >  •  “  »  “  J*  ^  *•»'  w~ 
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c 

c 

c 


call  schdl(  1, service, atrib) 

if(tnow.gt.2000)  servt(iat)  =  servt(iat)—  service 
else 

if(xx(iat).gt.O.)  then 
atrib(2)=  tnovv 
call  filem(iat, atrib) 
return 
else 

wait=0, 

call  colct(wait,iat) 
rm=  rmean(iat) 
atrib(4)=  iat 
call  nextguy(iat.inext) 

COLLECT  STATISTICS  WHILE  PARKED  AT  CPU 

if(iat.eq.2)  then 

multino(inext)=:  multino(inext)  — 1 
endif 

atrib(5)=inext 
service=expon(rm,2) 
xx(iat)=  1 

call  schdl(2, service, atrib) 

if(tnow.gt.2000)  servt(iat)=servt(iat) ^service 
endif 
endif 

if  (tnow.gt.2000)  then 
departfiat)=dpr'artfiat)+l 
depart(5)  =  depart(5)+l 
endif 

return 
end 

*********:*:**********:*:*************:*:***:******* 
subroutine  nextguy(iat,inext) 

common/ucoml /  depart(5),rmean(4),p(4,4),servt(4),ecount(2) 
cum  =  0. 

u  =  unfrm(0.,l.,2) 

do  10  index=  1,4 
cum=cum^p(iat, index) 
if(u.le.cum)  then 
inext  =  index 
goto  1 1 
else 

continue 


subroutine  otput 

common/scoml /  atrib(100),dd(l00),ddl(100).dtnow,ii,mfa,mstop,ncln 
I,ncrdr,nprnt,nnrun,nnset,ntape,ss(l00),sslfl00),tnext,tnow,xx(100) 
common/ucoml/  depart(5),rmean(4),p(4,4),servt(4),ecount(2) 
common/ucom2/  multino(4) 

write(l,*innrun 
writefl  ,*Vecount(i),i=  1,2) 
writef  1  ,*  Hccavg(i),i=  1,4) 

writef  l,*Vttavg(i),i=2,4) 
writefl,*  Hservt(i),i=  1,4) 
writefl  ,*Vdepart(i),i=  1,5) 
write(l,*)(ffawt(i),i=2,4) 
isum=0 
do  1  i=  1,4 

isum=  isum-rmuitino(i) 
continue 

write(l,*)(multino(i),i—  l,4),isum 


return 
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c  program  main(input,output.tape5  =  i nput, ta pe 6  =  output, tape7,ta pel, 

c  Itape2.tape3, tape-4) 

program  main 
dimension  nset(5000) 
common  qset(5000) 

common/scoml /  atrib(100),dd(100).ddl(  100),dtnow,ii,mfa,mstop,nclnr 
I,ncrdr,nprnt,nnrun,nnset,ntape.ss(l00),ssl(100btnext,tnow,xx(l00) 
common /ucoml /  depart ( 10),rmean(  10),p(  10,10),servt(  10),ecount(2) 
common /ucom2 /  isubcap.nusssn.numcust.tclear.nstudy 
equivalence  (nset(l),qset(l)) 
nnset  =  5000 
ncrdr=  5 
nprnt  =  6 
ntape  =  7 

read  (ncrdr,*)  isubcap,nusssn,numcust,tclear,nstudy 
read  (ncrdr,*)  (rmean(i),i=  l,nusssn~2) 

do  12  i=l,nusssn~2 
read  (ncrdr,*)  (p(i,j).j  =  l.nusssn-2) 

12  continue 

call  slam 

stop 

end 

(' 

(' 

(' 

subroutine  event(i) 

common/scoml /  atrib(  100), dd(  100),ddl(  100),dtnow,ii.mfa,mstop,nclnr 
l,ncrdr,nprnt,nnrun,nnset,ntape,ss(  100),ssl(  100),tnext,tnow,xx(  100) 
common /ucoml /  depart)  10),rmean(  10),p(  10.10),servt(  10),ecount(2) 
common /ucom2/  isubcap,nusssn.numcust,tclear,nstudy 

ecount(  1 1  =  ecount(  1)  —  1 
if(tnow.gt.tclear)  ecount(2)  =  ecount(2)~l 

goto  ( l,2),i 

1  call  arss 
return 

2  call  endss 
return 
end 

c 

c 

subroutine  intlc 

common/scoml  /  atrib(  100),dd(  100),ddl(  100),dtnow,ii,infa,mstop,nclnr 
I,ncrdr,nprnt,nnrun,nnset,ntape,ss(l00),ssl(l00),tnext,tnow,xx(l00) 
common/ucoml /  depart(  10),rmean(  1 0 ) . p(  10. 10),servt(  10),ecount(2) 
common/ucom2 /  isubcap,nusssn,numcust,tclear.nstudy 
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common/gcom5 /  iised(l0),jjbeg,jjclr,mmnit,mmon,nname(5),nncfi, 
&nnday,nnpt,nnprj(5),nnrns,nnstr,nnyr,sseed(l0),lseed(l0) 
common/ucom3/  multino(7) 
integer  iseed(2000) 

if(nnrun.eq.l)  then 
do  1  i=  1,2000 
iseed(i)=(l.e+12)*drand(l) 

1  continue 
endif 

iised(2)= iseed(nnrun) 
x=drand(-2) 

do  4  i=  1,7 
multino(i)=0 

4  continue 

do  5  i=  1,2 
ecount(i)=0. 

5  continue 

do  6  i=  l,nusssn-f-3 
depart(i)=0. 

6  continue 

do  7  i=  l,nusssn-l-2 
servt(i)=0. 

7  continue 

do  10  i=l,numcust 
etime=expon(rmean(l),2) 


atrib 

X 

=  etime 

atrib 

3 

=  i 

atrib 

4 

=  1 

atrib 

is) 

=  2 

call  schdl(l,etime, atrib) 

10  continue 

do  11  i=l,nusssn+2 
xx(i)=0. 

11  continue 
write(6,99)nnrun,nstudy 

99  format(lx, ’SIMULATION  STUDY  IN  PROGRESS  :  RUN  ’,i4,  ’  OF 
&’i4,’  RUNS’) 
return 
end 

c 

c 

subroutine  endss 

common/scoml /  atrib(l00),dd(l00),ddl(l00),dtnow,ii,mfa,mstop,nclnr 
l,ncrdr,nprnt,nnrun,nnset,ntape,ss(lOO),ssl(  100),tnext,tnow,xx(l00) 


common/ucoml /  depart(l0),rmean(l0),p(l0,10),servt(10),ecount(2) 
common/ucom2/  isubcap,nusssn,numcust,telear,nstudy 
common/ucom3/  multino(7) 

call  schdl(l,0.,atrib) 
myq=atrib(4) 

if(nnq(myq).ne.O)  then 
call  rmove(l,myq,atrib) 
wait=tnow-atrib(2) 
call  colct(wait,myq) 
rm=  rmean(myq) 
service=exponfrm,2) 
atrib(4)= atrib(5) 
iat=atrib(4)+.00001 
call  nextguy(iat,inext) 

COLLECT  STATISTICS  WHILE  PARKED  AT  CPU 

if(iat.eq.3)  then 

multino(inext)=  multino(inext)-bI 
endif 

atrib(5)— inext 
call  schdl(2, service, atrib) 
if(tnow.gt.tclear)  then 
se  rvt  ( my  q) = se  r  vt  ( my  q)  +se  rvic  e 
departfmyq)=depart(myq)+l 
depart(nusssn+3)=depart(nusssn-f3)-fl 
endif 
else 

xx(myq)=0. 

endif 

if(myq.eq.3.and.nnq(2).gt.0.and.isubcap.ne.0.and.inext.eq.l 
&.and.nnq(myq).ne.O)  then 
call  rmove(l,2,atrib) 
service=0. 
atribf4)=atrib(5) 
atrib(5)=3 

call  schdl(l, service, atrib) 
endif 

return 

end 

**  *************************  ****************■***********.*****  *****+■********** 
subroutine  arss 

common/scoml/  atrib(l00),dd(l00),ddl(100),dtnow,ii,mfa,mstop,nclnr 
l,ncrdr,nprnt,nnrun,nnset,ntape,ss(lOO),ssl(lOO),tnext,tnow,xx(lOO) 
common/ucoml/  depart(l0),rmean(l0),p(l0,10),servt(l0),ecount(2) 


common/ucom2 /  isubcap,nusssn,numcust,tclear,nstudy 
common/ucom3/  multino(T) 

iat  =  atrib(5) 

if(iat.eq.l)  then 
resp  =  tnow-atrib(l) 
call  colct(resp,l) 
rm=  rmean(l) 
service=expon(rm,2) 
atribf  lWtnow-f  service 
atrib(4|=  1 
atrib(5)=2 

call  schdl(l, service, atrib) 

if(tnow.gt.tclear)  servt(iat)=servt(iat) ^service 
go  to  101 
endif 

if(iat.eq.2)  then 
if(isubcap.ne.O)  then 
numsub=0 
do  10  i=3,nusssn+2 
numsub=  numsub+nnq(i)-i-xx(i) 
continue 

if(numsub.lt.isubcap)  then 
if(nnq(2).eq.O)  then 
wait=0. 
call  colct(wait,2) 
service=0. 
atrib(4)=2 
atrib(5)=3 

call  schdl(l, service, atrib) 
go  to  101 
else 

atrib(2)=tnow 
call  filem(2, atrib) 
call  rmove(l,2,atrib) 
wait=  tnow-atrib(2) 
call  colct(wait,2) 
atrib(4)=2 
atrib(5)=3 
service=0. 

call  schdl(l, service, atrib) 
go  to  101 
endif 
else 

atrib(2)=tnow 
call  filem(2, atrib) 
return 
endif 
endif 
endif 


■V-r- 


100  if(xx(iat).gt.O.)  then 
atrib(2)  —  tnow 
call  filem(iat,atrib) 
return 

else 

wait=0. 

call  colct(wait,iat) 
rm=  rmean(iat) 
atrib(4)=iat 
call  nextguy(iat,inext) 
c 

c  COLLECT  STATISTICS  WHILE  PARKED  AT  CPU 
c 

if(iat.eq.3)  then 

multino(inext)=multino(inext)+l 

endif 

atrib(5)=inext 
service=expon(rm,2) 
xx(iat)=  1 

call  schdl(2, service, atrib) 

if(tnow.gt.tclear)  servt(iat)= servt(iat)-rservice 
endif 

101  if  (tnow.gt.tclear)  then 
depart(iat)=depart(iat)+l 
depart(nusssn+3)=depart(nusssn-f3)+l 

endif 

return 

end 

c 

^a************************************************************************** 

C 

subroutine  nextguy(iat,inext) 

common/ucoral /  depart(l0),rmean(l0),p(l0,10),servt(l0),ecount(2) 
common/ucom2/  isubcap,nusssn,numcust,tclear,nstudy 

cum=0. 

u=unfrm(0.,l.,2) 

do  10  index=  l,nusssn+2 
cum=  cum+p(iat,  index) 
if(u.le.cum)  then 
inext= index 
goto  11 
else 

continue 

endif 

10  continue 


11  return 
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*********************************.********.***.******************** 
subroutine  otput 

common/scoml /  atrib(100),dd(l00),ddl(l00),dtnow,ii,nifa)mstop)nclnr 
l,ncrdr,nprnt,nnrun,nnset,ntape,ss(lOO),ssl(lOO),tnext,tnow,xx(lOO) 
common/ucoml /  depart(l0),rmean(l0),p(l0,10),servt(l0),ecount(2) 
common/ucom2/  isubcap,nusssn,numcust,tclear,nstudy 
common/ucom3/  multino(7) 

writefl,*)nnrun 
write!  l,*Mecount(i),i=  1,2) 
write!  l,*)(ccavg(i),i=  l,nusssn+2) 
write!  l,*)fttavg(i),i=2,nusssn+2) 
write!  l,*Vservt(i),i=  l,nusssn-f2) 
write(l,*)(depart(i),i=  l,nusssn+3) 
isum=0 
do  1  i=l,7 

isum=  isum-t-multino(i) 

1  continue 

write(l,*)(multino(i),i=l,7),isum 


return 

end 
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c  program  tree(input,output,tape7,tape5=  in  put,  tape6  =  out  put) 


\>> 


k.v. 


'.v. 


m 


This  program  uses  an  "ail  possible  regressions"  approach  to  * 
select  the  best  subset  of  controls  from  a  given  candidate  set.  * 
It  assumes  that  certain  number  of  meta-experiments  have  been 
performed  each  with  the  same  number  of  replications.  Once  the 
optimal  subset  has  been  identified,  a  confidence  region  is  * 
constructed  about  the  mean  vector  for  the  responses.  Coverage 
and  volume  reduction  is  tallyed  and  subsequently  summarized. 


The  program  can  be  run  in  two  modes.  The  user  can  either 
estimate  the  covariance  matrix  of  controls  or  incorporate 
it  directly.  The  program  variable  "iknow"  dictates  which 
option  is  in  effect  (see  code  below).  * 


The  program  can  also  be  run  in  the  "best  m"  regressions  mode. 

(  Currently  only  configured  for  estimated  covariance  matrix  * 
of  controls)  * 

In  other  words  it  will  compute  the  best  m  subsets  of  each  * 
possible  subset  size.  This  can  be  of  interest  if  a  single  set  * 
of  data  is  used.  * 


*  PARAMETERS  TO  BE  INITIALIZED: 

* 


nx  =  #  of  candidate  controls 

ny  =  #  of  responses 

keepers  =  #  of  best  regressions  to  be  kept 
(m  in  "m  best"  as  above) 
numreps  =  #  replications  per  meta  experiment 
meta  =  #  of  meta  experiments 


*  NOTE: 


IN  SUBROUTINE  COVER  :  nx2  AND  ny2  MUST  BE  SET 
TO  nx  and  ny  RESPECTIVELY.  * 

(IN  THE  PARAMETER  STATEMENT) 


******************************************************************** 


program  tree 

parameter  (nx=7,ny=2,nvar= nx+ny,keepers=6,knx=2**nx) 
parameter  (numreps=20,meta=50) 

parameter(nl  =  nvar,n2= numreps, n3=  numreps,  n4  =  l,n5=  l,n6=0) 

parameter(nnl  =  ((nl*(nl-|-l))/2)) 

common  sig,kk,iqq,ip 

character*25  title, respons(ny),control(nx) 

real  a(nvar,nvar,nvar) 

real  wkarea(ny),rss(ny,ny),dum(ny) 

integer  nk(nvar) 

dimension  x(n3,nl),nbr(6),temp(nl),xm(nl) 


A  A 


^vAv:v:v. 


& 


16-4 


real  vcv((nl*(nl  —  l))/2),full(  nl.nl) 
real  regr(keepers,nx,2),buff(keepers).bulI2(  keepers) 
real  ff(0:nx) 
external  f 

integer  models(knx,nx),ibuff(nx) 

integer  ih(nl) 

integer  icover(4),ictot(4) 

real  vecybar(ny),vr(2),volred(2),coverag(4) 

real  vecmuy(ny),vecmuc(nx),ybar(ny),cbar(nx),veccbar(nx) 

real  covcv(nx,nx) 

real  target(ny,ny) 

data  vecmuc/0.,0.,0.,0.,0.,0.,0./ 
data  vecybar  /78. 31305, .4132402/ 
data  vecmuy  /81.71,.413/ 

data  veccbar/-2.169668E-02, -1.41 694  lE-02,5.544987E-02, 

&  -1.809913E-02 

&  ,3.908565E-02,-1.957430E-02,3.610350E-03  / 

data  title /’MODEL5-.TRANSF ORMED  ’/ 
data  respons /’SYSTEM  RESPONSE  TIME’, 

&  ’CPU  UTILIZATION  ’/ 

data  control /’ROUTING  VARIABLE  (l)\ 

&  ’ROUTING  VARIABLE  (3)’, 

&  ’ROUTING  VARIABLE  (4)’, 

&  ’WORK  VARIABLE  (1)’, 

&  ’WORK  VARIABLE  2  ’, 

&  ’WORK  VARIABLE  3  ’, 

&  ’WORK  VARIABLE  (4)’/ 

open  (unit—  l,file=  ’out.model.5’,status=  ’new’) 
write(l,3l)  title, meta,numreps,meta*numreps 

31  format(lx,a25,’meta  =  ’,13,’  numreps  =  ’ ,i3, ’  total  reps  =  ’, 

&i4) 

write(l,32)meta*numreps 

32  format(lx,’the  response  are’, 13x, ’mean  ’,14,’  reps’, 2x, 

&’steady  state  mean’/) 

do  33  i=  l,ny 

write(l,34)i,respons(i),vecybar(i),vecmuy(i) 

34  format(2x,i2,lx,a25,fl2.5,4x,fl2.5) 

33  continue 
write(l,35) 

35  format(’  ’) 
write(l,36)meta*numreps 

36  format(lx,’the  candidate  controls  are’, 3x, ’mean  ’,i4,’  reps’, 

&2x, 'steady  state  mean’/) 

do  37  i=l,nx 

write(l,34)i,control(i),veccbar(i),vecmuc(i) 

37  continue 
write(l,35) 

c  iknow  IS  THE  FLAG  FOR  USE  OF  THE  KNOWN  COVARIANCE  MATRIX 
c  OF  CONTROLS 


c  iknow  =  1  known  cov  used 

c  iknow  =  0  cov  estimated 

c 

iknow=0 

if(iknow.eq.O)  then 
write(l,38) 
else 

write(l,39) 

endif 

38  format(/, lx, ’COVARIANCE  MATRIX  OF  CONTROLS  WAS  ESTIMATED’) 

39  format(/,lx, ’KNOWN  COVARIANCE  MATRIX  OF  CONTROLS  WAS  L'SED') 
c 

c  HERE  WE  READ  THE  KNOWN  COVARIANCE  STRUCTURE  OF  CONTROLS 
c  (IF  REQUIRED) 

c 

if(iknow.eq.l)  then 

open(unit=3,file=  ’cov.model.5’,status=  ’old’) 
rewind  3 
do  21  i=l,nx 

read(3,*)(covcv(i,j),j=  l,nx) 

21  continue 
endif 

nbr(l)=nl 

nbr(2)=n2 

nbr(3|=n3 

nbr(4)=n4 

nbr(5)=n5 

nbr(6)=n6 

ix=n3 

sig=.90 

c 

c  MAKE  THE  F  TABLE 
c 

ip=ny 
kk=numreps 
call  ftabl(ff,nx) 
print  *,’f  table’, ff 

c 

c  iwrite  =  0  Meta  Experiment  mode 
c  iwrite  =  1  Best  m  Regressions  mode 
c  (m=keepers  above)(meta  =  1) 

iwrite  =  0 

c 

c  INITIALIZE  COVERAGE  AND  VOLUME  REDUCTION  ACCUMULATORS 
c 


ic  tot  ( i  z )  =  0 
''50  continue 

do  851  iz—  1,2 
vr(iz)=0. 

851  continue 
c 

c  THIS  IS  THE  META  EXPERIMENT  LOOP 

c 

do  1000  mm=l,meta 

<: 

c  INITIALIZE  ARRAYS 

c 

numreg=0 
do  999  iz  =  1, keepers 
do  999  jz=l,nx 
do  999  kz  =  l,2 
regr(iz,jz,kz)=0. 

999  continue 

do  998  iz=  l,knx 
do  998  jz=  l,nx 
models(iz,jz)=0 
998  continue 

do  997  iz— l,nvar 
do  997  jz=l,nvar 
do  997  kz=l,nvar 
a(iz,jz,kz)=0. 

997  continue 

do  996  iz=  1, keepers 
buff(iz)=0 
buff2(iz)=0 
996  continue 

do  995  iz=  l,nx 
ibuff(iz)=0 
995  continue 

c 

c  READ  THE  DATA  (each  record  =>  [controlsjresponses 
c  COMPUTE  THE  COVARIANCE  MATRIX 
c  SAVE  SAMPLE  MEANS 
c  BOUND  THE  GENERALIZED  VARIANCE 
c 

do  10  i=  l,n2 
read(5,*)(x(i,j),j=  l,nl) 

10  continue 

call  becovm(x,ix,nbr,temp,xm,vcv,ier) 
do  13  i=  l,nx 
cbar(i)=xm(i) 

13  continue 


do  1-1  i=  nx~T.nl 
vbar(i-nx)=xm(i) 

11  continue 

call  vcvtsf(vcv, nl.full.nl) 
do  1 1  i  =  l.nl 
do  11  j=  l.nl 
a(l,i,j)=full(i,j) 

11  continue 
77  is  =  1 

do  99  ii=  l,ny 
do  99  jj  =  l,ny 
if(jj.gt.ii)  then 
rssni,jj|=a(is,nx-eii,nx~jj) 
rss(jj,ii)=  rss(ii.jj) 
else 

if(ii.eq.jj)  then 
rss(ii,jj)=a(is,nx-ii,nx~jj) 
endif 
endif 

99  continue 
iopt  =  5 

call  Iinv3f(rss,dum,4,ny,ny,dl,d2,wkarea.ier) 
if(ier.ne.O)print  *,"EDIED  BELOW  99" 
det=dl*2**d2 


big=(float(numreps-l)/float(numreps-nx-2))**nv 

two=2*big*dl*2**d2 


STUFF  THE  BOOKKEEPING  ARRAY  WITH  THE  BOUxND 


do  200  ii=l, keepers 
do  200  jj  =  l,nx 
regr(ii,jj,l)=  two 
200  continue 

c 

c  CONDUCT  A  BINARY  SEARCH  OF  THE  REGRESSION  TREE 
c  FURNIYAL  AND  WILSON  (1974) 


3 

4 


y;y. 


k=nx 
do  1  1=1, k 
nk(l)=0 
continue 
nk(k+l)=  1 
1=1 

nk(l)=  1 
do  3  m=l,k 

if(nk(m+l).eq.l)  go  to  4 
continue 

call  gauss(k-m+l,k-l+2,k-I+l,a,nvar,nvar) 


CALCULATION  OF  THE  GENERALIZED  RESIDUAL  COVARIANCE 


,r,v  /-A.'-’.'-'U’-  •  •  '  /V'/' 
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do  100  ii  =  l,nv 
do  100  jj=  l,ny 
if(jj.gt.ii)  then 
rssliLjjW  a(is,nx-ii,nx-jj ) 
rss(jj,ii)=rss(ii,jj) 
else 

if(ii.eq.jj)  then 
rss(ii,jj)  =  a(is,nx—  ii,nx-jj) 
endif 
endif 
continue 
iopt  — 5 


if(iknow.eq.O)  then 

call  Iinv3f(rss,dum,4,nv,nv,dl,d2,wkarea,ier) 
if(ier.ne.O)print  V’IDIED  BELOW  100" 
det=dl*2**d2 
endif 


c  BOOKKEEPING  LOGIC  TO  SAVE  M= KEEPERS  BEST  REGRESSIONS 
c  OF  ALL  J  SUBSETS  SIZES 
c 

mv=o 

do  300  n=l,nx 
mv=mv-rnk(n) 

300  continue 

if(iknow.eq.O)  then 

const=(float(numreps-l)/float(numreps-mv-l)) 
det— det*const**ny 
else 

call  covknow(rss,ny,full,nvar,target,dum,numreps,mv,det) 
endif 

do  301  j  =  l, keepers 
if(detdt.regr(j,mv,l))  then 
numreg— numreg-f-1 
do  302  jj=j,keepers-l 
buff(jj+l)  =  regr(jj,mv,l) 
buff2(jj+l)=regr(jj,mv,2) 

302  continue 
regrfj,mv,l)  =  det 
regr(j,mv,2)= numreg 

do  303  jj=j+l, keepers 
regrfjj,mv,l)=buff(jj) 
regr(jj  ,mv,2)= buff2(jj ) 

303  continue 

call  keepit(numreg, nk,nx, models, knx,nvar) 


go  to  304 
endif 

301  continue 
304  continue 
do  5  1=  l,k 
if(nk(l).eq.O)  go  to  2 
nk(l)  =  0 
5  continue 


c  THIS  BLOCK  IS  FOR  BEST  M  SUBSETS  MODE  OF  OPERATION 
c 

if(iwrite.eq.l)  then 
do  500  i=  l,nx 
write(l,600)  keepers, i 

600  format(l0x,’best  ’,i2,’  regressions  with  ' , i 2 , '  variables'//) 
do  500  j  =  l, keepers 

ivar=0 

iin=0 

do  400  ii=nx,l,-l 
ivar=ivar  +  l 

iffifix(regr(j,i,2)+.000l).eq.0)  go  to  500 
if(models(ifix(regr(j,i,2)-i-.000l),ii).eq.l)  then 
iin= iin-i-1 
ibuff(iin)—  ivar 
endif 

400  continue 

rdet  =  regr(j,i,l) 

write(l,60l)rdet,(ibuff(ij),ij  =  l,iin) 

601  format(lx,el6.8,10x,30(i2,lx)) 

500  continue 

endif 


c  FOR  EACH  SUBSET  COMPUTE  THE  CRITERION  AND  SAVE  THE  MINIMUM 
c 

if(iwrite.eq.O)  then 
ip=ny 
kk=numreps 
do  650  iq=  l,nx 

if(iknow.eq.O)  then 

regr(l,iq,l)=regr(l,iq,l)*c3(kk,iq,ip)*cfront(kk,iq,ip)* 

&  ff(iq) 
else 

regr(l,iq,l)=regr(l,iq,l)*c4(kk,iq,ip)*cfront(kk,iq,ip)* 

&  ff(iq) 
endif 

if(iq.eq.l)  rmin=  regr(l,iq,l)-!-1000. 

650  continue 

do  700  iq=  l,nx 
if(regr(l,iq,l).lt.rmin)  then 
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rmin  =  regr(l,iq,l) 
iat  =  regr(l,iq,2) 
endif 

700  continue 
ivar=0 
iin=0 

do  750  ii=nx,l,-l 
ivar=  ivar^l 

if(models(iat,ii).eq.l)  then 
iin=iin-*-l 
ibuff(iin)=ivar 
endif 

750  continue 
sp=rmin 

write(l,60l)sp,(ibuff(ij),ij  =  l,iin) 
c 

c  FIND  THE  VOLUME  REDUCTION  AND  INDICATE  COVERAGE 
c 

call  cover(vcv,  nl,nnl, models, knx,nx,iat,iin,ybar 
&,cbar,vecmuc,ny,vecmuy,numreps,ff,ih,icover,volred,vecybar 
&,iknow,covcv) 

c 

c  COVERAGE  AND  VOLUME  REDUCTION  TALLYS 

c 

do  800  ic=l,4 

ictot(ic)=ictot(ic)+icover(ic) 

800  continue 

do  801  ic=l,2 
vr(ic)=vr(ic)-Fvolred(ic) 

801  continue 

endif 

print  *,"THIS  IS  META-EXPERIMENT  #  ”,mm,"icover  ".icover 

1000  continue 

do  1001  iz  =  1,2 
vr(iz)=vr(iz)/float(meta) 

1001  continue 

do  1002  iz=  1,4 

coverag(iz)=float(ictot(iz))/float(meta) 

1002  continue 

write(l,602)coverag(l),vr(li 

602  format(lx,’contrld  coverage  on  steady  -*  ■  •  • 

&’  vol  reduct  ’,el6.8) 

write(l,603)coverag(2) 

603  format(lx,’uncontrld  coverage 
write(l,604)coverag(3l.vr(  1  1 

604  format(lx,’contrld  coverag* 

Sc'  vol  reduct  \el6A1 

write(l,605)coverag  I 


“*0-8186  637 
UNCLASSIFIED 


CONTROL  VARIATE  SELECTION  FOR  MULTIRESPONSE  SIMULATION  2/2 
(U)  AIR  FORCE  INST  OF  TECH  NRIGHT-PATTERSON  AFB  OH 
K  u  BAUER  MAV  87  AFIT7CI/NR-87-132D 

F/G  12/2  NL 


605  format(lx,’uncontrld  coverage  on  sample  mean  of  1000  reps’,fl2.8) 


stop 

end 

subroutine  gauss(ib,is,ip,a,kp,nvar) 
c 

c  THIS  SUBROUTINE  PERFORMS  THE  PIVOTS  FOR  VARIABLE 
c  INTRODUCTION  INTO  REGRESSION  MODELS 

c  FUR  NIVAL  AND  WILSON  1974 

c 

real  a(nvar,nvar,nvar) 
lb=ip+l 
c 

c  TOLERANCE  CHECK  ON  PIVOTS 

c 

if(a(ib,ip,ip).lt..Ol)  then 
do  10  l=lb,kp 
a(is,3p,l)=a(ib,ip,l) 
do  10  m=l,kp 
a(is,l,m)=a(ib,I,m) 

10  continue 
return 
else 

do  1  l=lb,kp 

a(is,ip,l)=a(ib,ip,l)/a(ib,ip,ip) 
do  1  m=l,kp 

a(is,l,m)— a(ib,l,m)-a(ib,ip,m)*a(is,ip,l) 

1  continue 
return 
endif 

end 


subroutine  keepit(numreg,nk,nx, models, knx,nvar) 
c 

c  THIS  SUBROUTINE  FINDS  THE  MODEL  OF  A  CANDIDATE  REGRESSION 
c 

integer  nk(nvar),models(knx,nx) 
do  1  i=l,nx 
models(numreg,i)=nk(i) 

1  continue 
return 
end 
c 

c  THE  FOLLOWING  FUNCTIONS  ARE  USED  TO  COMPUTE  THE  SELECTION 
c  CRITERION 
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real  function  c3(k,iq,ip) 

c3=cl(k,iq,ip)*c2(k,iq,ip) 

return 

end 

real  function  c4(k,iq,ip) 
prod=l. 
do  10  i=l,ip 
top=float(k-iq-i) 
bot=  float(k-iq-l) 
prod= prod*(top  /bot) 

10  continue 
c4=prod 
return 
end 

real  function  cfront(k,iq,ip) 

top—  floatfk-iq-l) 

bot= float(k-iq-ip) 

cfront=  (top  /bot)**ip 

return 

end 

real  function  cl(k,iq,ip) 
prod=l. 
do  10  i=l,ip 
itop=  (k-iq-i) 
ibot=(k-iq-l)*k 
term=  floatfitop)  /float(ibot) 
prod=prod^term 
10  continue 
cl=prod 
return 
end 

* 

real  function  c2(k,iq,ip) 
sum=0. 

Pl=l- 

p2=  1. 

do  10  j  =  0,ip 
ileft = j  comb(ip  ,j ) 
if(j.ne.O)  then 
pl=pl*(iq+2*(j-l)) 
p2=p2*(k-iq-(2*j)) 
rnext=pl/p2 
else 

rnext=  1. 
endif 

term=  float(ileft)*rnext 
sum=sum+term 
10  continue 
c2=sum 


return 

end 


integer  function  jcomb(n,m) 

itop=nfactfn) 

ibot=nfact(n-m)*nfact(m) 

j  comb = itop  /ibot 

return 

end 

integer  function  nfact(m) 
if(m.eq.O)then 
nfact=  1 
return 
endif 
ip=m 
iloop— m-1 
do  10  i=  iloop, 2,-1 
ip— ip*i 
10  continue 
nfact=ip 
return 
end 
c 

c  THIS  SUBROUTINE  COMPUTES  A  F  TABLE  (TO  THE  POWER  P) 
c 

subroutine  ftabl(fif,nx) 

common  sig,kk,iqq,ip 

real  root(l),last,ff(0:nx) 

external  f 

eps=.001 

nsig=5 

nroot=l 

itmax=  1000 

last=3. 

do  10  iqq=0,nx 
root(l)=last 

101  call  zreal2(f,eps,eps,eps,nsig,nroot,root,itmax,ier) 
if(ier.eq.33)  then 

root(l)=last-l-l. 

ier=0 

write(8,102) 

102  format( lx, ’ignore  last  ier=33  warning  —  reinitializing’) 
go  to  101 

endif 

last=root(l) 
fp=root(l)**ip 
ff(iqq)=fp 
10  continue 
return 
end 


real  function  f(z) 


common  sig,kk,iqq,ip 
nl  =  ip 

n2=kk-iqq-ip 

call  mdfd(z,nl,n2,p,ier) 

f=sig-p 

return 

end 

subroutine  cover(vcv, nl,nnl, models, knx,nx,iat,iin,ybar 
&,cbar,vecmuc,ny,vecmuy,numreps,ff,ih,icover,volred,vecybar 
&,iknow,covcv) 
c 

c  THIS  SUBROUTINE  DOES  THE  COVERAGE  AND  VOLUME  REDUCTION  CALC 
c  FOR  THE  OPTIMAL  CONTROL  SUBSET 

parameter(nx2=7,ny2=2,nl2=nx2+ny2,nnl2=((nl2*(nl2+l))/2)) 

real  vcv(nnl),ybar(ny),cbar(nx),vecmuc(nx),vecmuy(ny) 
&,ff(0:nx),vecybar(ny),volred(2),covcv(nx,nx) 
integer  models(knx,nx),ih(nl).icover(4) 

real  scbar(nx2),svecmu(nx2),subv(nnl2),subvf(nl2,nl2) 
&,b(nl2),wkarea(2*nl2),buffl(nl2,nl2),buff2(ny2,nx2) 
&,beta(ny2,nx2),cdevl(l,nx2),cdev2(nx2,l),expl(ny2,ny2) 
&,dev(ny2,l),ybhat(ny2),buffi>(nx2,ny2) 
&,buff4(ny2,ny2),sydotc(ny2,ny2),hph(l,l),tl(l,nx2) 
&,ymdl(l,ny2),ymd2(ny2,l),t2fl,ny2),obs(l,l) 
&,bufif5(ny2,ny2),bufif6(ny2,ny2),ymd3(l,ny2),ymd4(ny2,l) 

&,obs2(l,l) 

&,symcovc((nx2*(nx2-t-l))/2’/,subcovc((nx2*(nx2+l))/2) 

&,fulcovc(nx2,nx2),gamma(ny2,nx2) 

&,ehat(ny2,ny2),buff9(ny2,ny2) 

&tcancorr(ny2,ny2),reigs(ny2),eigs(2*ny2),dummy(ny2,ny2) 

&,wk(ny2) 
integer  ih2(nx2) 
complex  ceigs(ny2) 
equivalence  (eigs(l),ceigs(l)) 


INITIALIZE  COVERAGE  AND  VOLUME  REDUCTION  VECTORS 

do  8  i=  1,4 
icover(i)=0 
continue 
do  9  i—  1,2 
volred(i)=0. 
continue 


FIND  THE  SUBMATRIX  FOR  THE  SELECTED  MODEL 
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do  10  i=l,nl 
if(i.le.nx)  then 
ih(i)=0 
ih2(i)=0 
else 

•4)=! 

endif 

10  continue 
ivar=0 

do  50  ii=nx,l,-l 
ivar=ivar+l 

if(models(iat,ii).eq.l)  then 
ih(ivar)=l 
ih2(ivar)=  1 
endif 

50  continue 
ml=nl 

call  rlsubm(vcv,ml,ih,subv,m2) 


FIND  THE  SUBVECTOR  (POPULATION  AND  SAMPLE)  OF  THE 
CONTROL  MEANS 


index=0 
do  100  ii=l,nx 
if(ih(ii).eq.l)  then 
index= index -fl 
scbar(index)= cbar(ii) 
svecmu(index)=vecmuc(ii) 
endif 

100  continue 


BUFFER  THE  COVARIANCE  MATRIX  OF  SELECTED  CONTROLS 
AND  RESPONSES 


call  vcvtsf(subv,m2,subvf,nl2) 

do  101  i=l,m2 
do  101  j  =  l,m2 

buffi  (i,j)=subvf(i,j) 

101  continue 


INVERT  THE  COVARIANCE  SUBMATRIX  OF  CONTROLS 


call  Iinv3f(subvf,b,l,iin,nl2,dl,d2,wkarea,ier) 
if(ier.ne.0)print  V’lDIFD  BELOW  101" 
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c  BUFFER  THE  CROSS-COVARIANCE  SUBMATRICES  OF  SELECTED  CONTR 
c  WITH  RESPONSES 

c 

do  102  i=iin+l,m2 
do  102  j=l,iin 

buff2(i-iin,j)=buffl(i,j) 

buff3(j,i-im)=buffl(i,j) 

102  continue 


BUFFER  THE  COVARIANCE  SUBMATRIX  OF  RESPONSES 


do  105  i=iin-l-l,m2 
do  105  j  =  iin+l,m2 
buff4n-iin,j-iinl=bufflfi,j) 
buff6(i-iin,j-iin)— buffl(i,j) 
105  continue 


FIND  THE  BETA  HAT  MATRIX  (  CONTROL  COEFFICIENTS  ) 
OR  THE  GAMMA  HAT  MATRIX 


if(iknow.eq.O)  then 

call  vmulff(buff2,subvf,ny2,iin,iin,ny2,nl2,beta,ny2,ier) 
else 

call  vmulff(buff2,subvf,ny2,iin,iin,ny2,nl2,beta,ny2,ier) 

call  vcvtfs(covcv,nx2,nx,symcovc) 

call  rlsubm(symcovc,nx2,ih2,subcovc,iorder) 

call  vcvtsf(subcovc,iorder,fulcovc,nx2) 

call  linv3f(fulcovc,b,l, iin,nx2, dl,d2,wkarea,ier) 

call  vmulff(buff2,fulcovc,ny2,iin,iin,ny2,nx2, gamma, ny2,ier) 

endif 


FIND  THE  VECTOR  OF  CORRECTIONS  TO  CONTROL  Y  BAR 


do  103  i=l,iin 

cdevlfl,n=scbar(i)-svecmu(i) 

cdev2(i,l)=cdevl(l,i) 

103  continue 


if(iknow.eq.O)  then 

call  vmulff( beta, cdev2,ny2,iin,l,ny2,nx2, dev, ny2,ier) 


call  vmu!ff(gamma,cdev2,ny2,iin,l,ny2,nx2,dev,ny2,ier) 
endif 

c 

c  FIND  THE  CONTROLLED  ESTIMATOR  OF  THE  MEAN 
c 

do  104  i=l,ny2 

ybhat(i)=ybar(i)-dev(i,l) 

104  continue 


c 

c  FIND  THE  MATRIX  OF  EXPLAINED  COVARIANCE  DUE  TO 
c  CONTROL 

c 

call  vmulff(beta,bufF3,ny2,iin,ny2,ny2,nx2,expl,ny2,ier) 

c 

c  FIND  THE  RESIDUAL  COVARIANCE 

c 

cl  =  (float(numreps-l)/float(numreps-iin-l)) 

do  106  i=l,ny2 
do  106  j  =  l,ny2 

sydotc(ij)==(buff4(i,j)-expl(i,j))*cl 
buff5(i,j)  =sydotc(i,j) 

106  continue 

c  FIND  THE  ESTIMATOR  SIGMA  TILDE  HAT 

c 

if(iknow.eq.l)  then 

const  l  =  (floatfnumreps-2))/(float(numreps*(numreps-l))) 
const2=(float(iin+l))/(float(numreps*(numreps-l))) 
do  206  i=l,ny2 
do  206  j=  l,ny2 

ehat(i,j)=(constl*sydotc(i,j))+(const2*buff4(i,j)) 

bu£F9(i,j)=ehat(i,j) 

206  continue 

endif 

c 

c  FIND  THE  INVERSE  RESIDUAL  COVARIANCE  MATRIX 

c 

if(iknow.eq.O)  then 

call  linv3f(sydotc,b,l7iiy2,ny2,dl,d2,wkarea,ier) 
else 

call  Iinv3f(ehat,b,l,ny2,ny2,dl,d2,wkarea,ier) 
endif 


c  COMPUTE  THE  DEVIATIONS  FROM  THE  STEADY-STATE 

c  RESPONSE  VECTOR 

c  (both  cases:  controlled/uncontrolled) 

c 

do  107  i=l,ny2 
ymdl(l,i)=ybhat(i)-vecmuy(i) 
ymd2fi,lj=ymdl(l,i) 
ymd3f  l,n=ybarm-vecmuy(i) 
ymd4(i,l)=yTnd3(l,i) 

107  continue 

c 

c  COMPUTE  H’H 

c  (Notation  as  per  Venkatraman  and  Wilson  1986) 
c 

if(iknow.eq.O)  then 

call  vmulfffcdevl,subvf,l,iin,iin,l,nl2,tl,l,ier) 
call  vmulff(tl,cdev2,l)iin)l> l,nx2, hph,l,ier) 
endif 

if(iknow.eq.O)  then 

x=(l./float(numreps))+(l./float(numreps-l))*hph(l,l) 
else 
x—  1. 
endif 

c 

c  COMPUTE  THE  RIGHT  HAND  SIDE 

c  FOR  THE  CONFIDENCE  REGION 

c  AS  PER  RAO  (1967) 

c 

c2=(float((numreps-iin-l)*ny2)/float(numreps-iin-ny2)) 

f=  exp((l./float(ny2))*alog(ff(iin))) 

rhs=x*c2*f 

c 

c  COMPUTE  THE  T**2  STATISTIC 

c  FOR  THE  CASE  WHERE  CONTROLS  ARE  USED 

c  (steady  state  assumed) 

c 

if(iknow.eq.O)  then 

call  vmulfffymdl,sydotc,l,ny2,ny2,l»ny2,t2,l,ier) 
call  vmulff(t2,ymd2,l,ny2,l,l,ny2,obs,l,ier) 
else 

call  vmulffTymdl,ehat,l,ny2,ny2,l,ny2,t2,l,ier) 
call  vmulff(t2,ymd2,l,ny2,lrliny2,obs,l,ier) 
endif 

c 

c  INDICATE  COVERAGE 

c  FOR  THE  CASE  WHERE  CONTROLS  ARE  USED 

c  (steady  state  assumed) 
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if(obs(l,l).le.rhs)  then 
icover(lj=l 
else 

icover(l)=0 

endif 


COMPUTE  THE  VOLUME  REDUCTION 
if(iknow.eq.O)  then 

call  nnv3f(buff4,b,4,ny2,ny2,dl,d2,wkarea,ier) 
ucdet=dl*2**d2 

call  rmv3f(buff5,b,4,ny2,ny2,dl,d2,wkarea,ier) 
cdet=dl*2**d2 
else 

call  Iinv3f{buff4,b,4,ny2,ny2,dl,d2,wkarea,ier) 
ucdet=dl*2**d2  ' 

call  Iinv3f(buff9,b,4,ny2,ny2,dl,d2,wkarea,ier) 
cdet=dl*2**d2 
endif 

terml=(cdet/ucdet)**(.5)*x**(float(ny2)/2.) 

c3=float(fnumreps-iin-l)*(numreps)*(numreps-ny2)) 

c4=float((numreps-iin-ny2)*(numreps-l)) 

term2=(c3/c4)**(float(ny2)/2.) 

f2=exp((l./float(ny2))*alog(ff(0))) 

term3=(f/f2)**(float(ny2)/2.) 

volred(l)=(l.-(terml*term2*term3))*100. 


COMPUTE  THE  T**2  STATISTIC 
FOR  THE  CASE  WHERE  no  CONTROLS  ARE  USED 

call  Iinv3f(buff6,b,l,ny2,ny2,dl,d2,wkarea,ier) 

call  vmulfffymd3, buff6, 1,ny2,ny2,l,ny2,t2,l,ier) 

call  vmulff(t2,ymd4,l,ny2,l,l,ny2,obs2,l,ier) 

COMPUTE  THE  RIGHT  HAND  SIDE 
FOR  THE  CONFIDENCE  REGION 

c5=(float((num.reps-l)*ny2)/float(fnumreps-ny2)*numreps)) 

rhs2=exp((l./float(ny2))*alog(ff(0)j)*c5 


INDICATE  COVERAGE 

FOR  THE  CASE  WHERE  no  CONTROLS  ARE  USED 
(steady  state  assumed) 

if(obs2(l,l).le.rhs2)  then 
icover(2)=  1 
else 

icover(2)=0 
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endif 


THE  REMAINING  ANALYSIS  DUPLICATES  THE  ABOVE  SAVE  THAT 
THE  GRAND  MEAN  OF  1000  RESPONSES  IS  USED 


RECOMPUTE  DEVIATIONS 


do  108  i=l,ny2 
ymdl(l,i)=ybhat(i)-vecybar(i) 
ymd2(i,lj=ynidl(l,i) 
ymd3(l,i]=ybarm-vecybar(i) 
ymd4(i,  1 ) = ymd3(i  ,i) 
continue 


COMPUTE  THE  T**2  STATISTIC 
FOR  THE  CASE  WHERE  CONTROLS  ARE  USED 
(Grand  mean  used) 


if(iknow.eq.0)  then 

call  vmulff(ymdl,sydotc,l)iiy2,ny2,l,ny2,t2,l,ier) 

call  vmulff(t2,ymd2,l,ny2,l,l|Hy2,obs,l,ier) 
else 

call  vrnulff(ymdl,ehat,l,ny2,ny2,l,ny2,t2,l,ier) 

call  vmulff(t2,ymd2,l,ny2,l,l,ny2,obs,l,ier) 
endif 


INDICATE  COVERAGE 

FOR  THE  CASE  WHERE  CONTROLS  ARE  USED 


if(obs(l,l)-le.rhs)  then 
icover(3)=  1 
else 

icover(3)=0 

endif 


COMPUTE  THE  T**2  STATISTIC 
FOR  THE  CASE  WHERE  no  CONTROLS  ARE  USED 
(Grand  mean  used) 


vmulfffymd3,buff6,l,ny2,ny2,l,ny2,t2,l»i 

vmulff(t2,ymd4,l»ny2,ltl>ny2,obs2,l.ier) 


INDICATE  COVERAGE 

FOR  THE  CASE  WHERE  no  CONTROLS  ARE  USED 


if(obs2(l,l)*le.rhs2)  then 
icover(4)=  1 
else 

icover(4)=0 
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endif 

c 

c  THIS  SECTION  COMPUTES  THE  CANONICAL  CORRELATIONS 

c  FOR  THE  SUBSET  MODELS  AND  THE  FEASEBELTY  BOUND 

c  FOR  USING  THE  KNOW  COVARIANCE  MATRIX  OF  CONTROLS 

c 

if(iknow.eq.l)  then 

call  vmulff(buff6,expl,ny2,ny2,ny2,ny2,ny2,cancorr,ny2,ny2,ier) 
call  eigrf(cancorr,ny2,ny2,0,eigs, dummy, ny2,wk,ier) 

icount=0 
do  300  i=  l,ny2 
do  300  j=l,2 
icount=icount+l 

if(j.eq.l)  reigs(i)=sqrt(eigs(icount)) 

300  continue 

ctop=--float((numreps-|-iin-l)*(numreps-iin-2))/ 

&  float((numreps-l)*(numreps-2)) 
cbot= ctop*(float(numreps-2)  / float(numreps+iin- 1 )) 
bound=sqrt{(ctop-l.)/(cbot-l.)) 

print  *, "Canonical  correlations  ",reigs,"  Bound  ", bound 
print  *,eigs 
endif 

return 

end 

c 

c  THIS  SUBROUTINE  RETURNS  THE  GENERALIZED  VARIANCE 

c  OF  SIGMA  TILDE  HAT 

c 

subroutine  covknow(rss, ny, full, nvar, target, dum,numreps,mv,det) 
real  rss(ny,ny),full(nvar,nvar),target(ny,ny),dum(ny) 

cl  =  (floatfnumreps-2)/float(numreps*(numreps-mv-l))) 
c2=(float(mv-)-l)/float(numreps:‘(numreps-l))) 
nx=  nvar-ny 


do  10  i=l,ny 
do  10  j  =  l,ny 

target(i,j)=(cl*rss(i,j))+(c2*full(nx-t-i,nx+j)) 

10  continue 

call  Iinv3f(target,dum,4,ny,ny,dl,d2,wkarea,ier) 

det=dl*2**d2 

return 

end 
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